Summary by Natalia Diaz Rodriguez, PhD 4 days ago
Through a likelihood-focused derivation of a variational inference (VI) loss, Variational Generative Experience Replay (VGER) presents the closest appropriate likelihood- focused alternative to Variational Continual Learning (VCL), the state-of the art prior-focused approach to continual learning.
In non continual learning, the aim is to learn parameters $\omega$ using labelled training data $\mathcal{D}$ to infer $p(y|\omega, x)$. In the continual learning context, instead, the data is not independently and identically distributed (i.i.d.), but may be split into separate tasks $\mathcal{D}_t = (X_t, Y_t)$ whose examples $x_t^{n_t}$ and $y_t^{n_t}$ are assumed to be i.i.d.
In \cite{Farquhar18}, as the loss at time $t$ cannot be estimated for previously discarded datasets, to approximate the distribution of past datasets $p_t(x,y)$, VGER (Variational Generative Experience Replay) trains a GAN $q_t(x, y)$ to produce ($\hat{x}, \hat{y}$) pairs for each class in each dataset as it arrives (generator is kept while data is discarded after each dataset is used). The variational free energy $\mathcal{F}_T$ is used to train on dataset $\mathcal{D}_T$ augmented with samples generated by the GAN. In this way the prior is set as the posterior approximation from the previous task.

more
less