Through a likelihood-focused derivation of a variational inference (VI) loss, Variational Generative Experience Replay (VGER) presents the closest appropriate likelihood-focused alternative to Variational Continual Learning (VCL), the state-of-the-art prior-focused approach to continual learning. In non-continual learning, the aim is to learn parameters $\omega$ from labelled training data $\mathcal{D}$ in order to infer $p(y \mid \omega, x)$. In the continual learning setting, by contrast, the data are not independently and identically distributed (i.i.d.) but may be split into separate tasks $\mathcal{D}_t = (X_t, Y_t)$, whose examples $x_t^{n_t}$ and $y_t^{n_t}$ are assumed to be i.i.d. In \cite{Farquhar18}, the loss at time $t$ cannot be estimated for datasets that have already been discarded. To approximate the distribution of past datasets $p_t(x, y)$, VGER trains a GAN $q_t(x, y)$ to produce $(\hat{x}, \hat{y})$ pairs for each class in each dataset as it arrives; the generator is kept, while the data are discarded after each dataset is used. The variational free energy $\mathcal{F}_T$ is then used to train on dataset $\mathcal{D}_T$ augmented with samples generated by the GAN. In this way, the prior is set to the posterior approximation from the previous task.
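The replay step described above can be illustrated with a small sketch. It only shows how a training set for the current task might be assembled from real data plus generated pairs; it does not train a GAN or compute $\mathcal{F}_T$. All names here (`make_toy_generator`, `augment_with_replay`, `n_per_generator`) are hypothetical, and the Gaussian sampler is a stand-in assumed for illustration in place of a trained generator $q_t(x, y)$.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[float, int]                 # (x, y) pair; toy 1-D inputs for illustration
Generator = Callable[[int], List[Example]]  # hypothetical stand-in for a trained q_t(x, y)


def make_toy_generator(label: int, centre: float, seed: int) -> Generator:
    """Placeholder for a per-class GAN (assumption): samples x near `centre`."""
    rng = random.Random(seed)

    def sample(n: int) -> List[Example]:
        return [(rng.gauss(centre, 1.0), label) for _ in range(n)]

    return sample


def augment_with_replay(
    current: List[Example],
    generators: List[Generator],
    n_per_generator: int,
) -> List[Example]:
    """Return the current task's real data followed by generated pairs
    drawn from every generator kept from earlier tasks."""
    replay = [pair for g in generators for pair in g(n_per_generator)]
    return current + replay


# Generators kept from two earlier classes; their real data has been discarded.
old_generators = [
    make_toy_generator(label=0, centre=-2.0, seed=0),
    make_toy_generator(label=1, centre=2.0, seed=1),
]
current_task = [(5.1, 2), (4.8, 2), (5.3, 2)]  # real data for the newest task

train_set = augment_with_replay(current_task, old_generators, n_per_generator=4)
```

In this sketch the augmented `train_set` would then be passed to whatever VI objective is in use. How many samples to draw per generator is a free choice here, not something the summary specifies.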
