Continual Learning Literature

This repository is maintained by Massimo Caccia and Timothée Lesort. Don't hesitate to send us an email to collaborate or fix some entries ({massimo.p.caccia , t.lesort} at gmail.com). The automation script of this repo is adapted from Automatic_Awesome_Bibliography.

To contribute to the repository, please follow the process here

Outline

Classics

Catastrophic forgetting in connectionist networks , (1999) by French, Robert M. [bib]
Lifelong robot learning , (1995) by Thrun, Sebastian and Mitchell, Tom M [bib]

Argues knowledge transfer is essential if robots are to learn control with moderate learning times

Catastrophic interference in connectionist networks: The sequential learning problem , (1989) by McCloskey, Michael and Cohen, Neal J [bib]

Introduces CL and reveals the catastrophic forgetting problem

Surveys

GDumb: A Simple Approach that Questions Our Progress in Continual Learning , (2020) by Prabhu, Ameya, Torr, Philip HS and Dokania, Puneet K [bib]

Introduces a very simple method that outperforms almost all methods on all of the CL benchmarks. We need new, better benchmarks.

Continual learning for robotics: Definition, framework, learning strategies, opportunities and challenges , (2020) by Timothée Lesort, Vincenzo Lomonaco, Andrei Stoian, Davide Maltoni, David Filliat and Natalia Díaz-Rodríguez [bib]
Continual learning: A comparative study on how to defy forgetting in classification tasks , (2019) by Matthias De Lange, Rahaf Aljundi, Marc Masana, Sarah Parisot, Xu Jia, Ales Leonardis, Gregory Slabaugh and Tinne Tuytelaars [bib]

Extensive empirical study of CL methods (in the multi-head setting)

Continual lifelong learning with neural networks: A review , (2019) by German I. Parisi, Ronald Kemker, Jose L. Part, Christopher Kanan and Stefan Wermter [bib]

An extensive review of CL

Three scenarios for continual learning , (2019) by van de Ven, Gido M and Tolias, Andreas S [bib]

An extensive review of CL methods in three different scenarios (task-, domain-, and class-incremental learning)

Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks , (2018) by Andrea Soltoggio, Kenneth O. Stanley and Sebastian Risi [bib]

Influentials

Efficient Lifelong Learning with A-GEM , (2019) by Chaudhry, Arslan, Ranzato, Marc’Aurelio, Rohrbach, Marcus and Elhoseiny, Mohamed [bib]

More efficient GEM; Introduces online continual learning

Towards Robust Evaluations of Continual Learning , (2018) by Farquhar, Sebastian and Gal, Yarin [bib]

Proposes desiderata and reexamines the evaluation protocol

Continual Learning in Practice , (2018) by Diethe, Tom, Borchert, Tom, Thereska, Eno, Pigem, Borja de Balle and Lawrence, Neil [bib]

Proposes a reference architecture for a continual learning system

Overcoming catastrophic forgetting in neural networks , (2017) by Kirkpatrick, James, Pascanu, Razvan, Rabinowitz, Neil, Veness, Joel, Desjardins, Guillaume, Rusu, Andrei A, Milan, Kieran, Quan, John, Ramalho, Tiago, Grabska-Barwinska, Agnieszka and others [bib]
Gradient Episodic Memory for Continual Learning , (2017) by Lopez-Paz, David and Ranzato, Marc-Aurelio [bib]

A model that alleviates CF via constrained optimization

Continual learning with deep generative replay , (2017) by Shin, Hanul, Lee, Jung Kwon, Kim, Jaehong and Kim, Jiwon [bib]

Introduces generative replay

An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks , (2013) by Goodfellow, I. J., Mirza, M., Xiao, D., Courville, A. and Bengio, Y. [bib]

Investigates CF in neural networks

New Settings or Metrics

Wandering Within a World: Online Contextualized Few-Shot Learning , (2020) by Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer and Richard S. Zemel [bib]

Proposes a new continual few-shot setting where spatial and temporal context can be leveraged and unseen classes need to be predicted

Defining Benchmarks for Continual Few-Shot Learning , (2020) by Antoniou, Antreas, Patacchiola, Massimiliano, Ochal, Mateusz and Storkey, Amos [bib]

(title is a good enough summary)

Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning , (2020) by Caccia, Massimo, Rodriguez, Pau, Ostapenko, Oleksiy, Normandin, Fabrice, Lin, Min, Caccia, Lucas, Laradji, Issam, Rish, Irina, Lacoste, Alexandre, Vazquez, David and others [bib]

Proposes a new approach to CL evaluation more aligned with real-life applications, bringing CL closer to Online Learning and Open-World learning

Compositional Language Continual Learning , (2020) by Yuanpeng Li, Liang Zhao, Kenneth Church and Mohamed Elhoseiny [bib]

method for compositional continual learning of sequence-to-sequence models

Regularization Methods

Continual Learning with Bayesian Neural Networks for Non-Stationary Data , (2020) by Richard Kurle, Botond Cseke, Alexej Klushyn, Patrick van der Smagt and Stephan Günnemann [bib]

continual learning for non-stationary data using Bayesian neural networks and memory-based online variational Bayes

Improving and Understanding Variational Continual Learning , (2019) by Siddharth Swaroop, Cuong V. Nguyen, Thang D. Bui and Richard E. Turner [bib]

Improved results and interpretation of VCL.

Uncertainty-based Continual Learning with Adaptive Regularization , (2019) by Ahn, Hongjoon, Cha, Sungmin, Lee, Donggyu and Moon, Taesup [bib]

Introduces VCL with uncertainty measured for neurons instead of weights.

Functional Regularisation for Continual Learning with Gaussian Processes , (2019) by Titsias, Michalis K, Schwarz, Jonathan, Matthews, Alexander G de G, Pascanu, Razvan and Teh, Yee Whye [bib]

functional regularisation for Continual Learning: avoids forgetting a previous task by constructing and memorising an approximate posterior belief over the underlying task-specific function

Task Agnostic Continual Learning Using Online Variational Bayes , (2018) by Chen Zeno, Itay Golan, Elad Hoffer and Daniel Soudry [bib]

Introduces an optimizer for CL that relies on closed-form updates of the mu and sigma of a BNN; introduces the label trick for class learning (single-head)

Overcoming Catastrophic Interference using Conceptor-Aided Backpropagation , (2018) by Xu He and Herbert Jaeger [bib]

Conceptor-Aided Backprop (CAB): gradients are shielded by conceptors against degradation of previously learned tasks

Riemannian Walk for Incremental Learning: Understanding Forgetting and Intransigence , (2018) by Chaudhry, Arslan, Dokania, Puneet K, Ajanthan, Thalaiyasingam and Torr, Philip HS [bib]

Formalizes the shortcomings of multi-head evaluation, as well as the importance of replay in the single-head setup. Presents an improved version of EWC.

Variational Continual Learning , (2018) by Cuong V. Nguyen, Yingzhen Li, Thang D. Bui and Richard E. Turner [bib]
Progress & compress: A scalable framework for continual learning , (2018) by Schwarz, Jonathan, Luketina, Jelena, Czarnecki, Wojciech M, Grabska-Barwinska, Agnieszka, Teh, Yee Whye, Pascanu, Razvan and Hadsell, Raia [bib]

A new P&C architecture; online EWC for keeping the knowledge about the previous task, knowledge distillation for keeping the knowledge about the current task (Multi-head setting, RL)

Facilitating Bayesian Continual Learning by Natural Gradients and Stein Gradients , (2018) by Chen, Yu, Diethe, Tom and Lawrence, Neil [bib]

Improves on VCL

Overcoming catastrophic forgetting in neural networks , (2017) by Kirkpatrick, James, Pascanu, Razvan, Rabinowitz, Neil, Veness, Joel, Desjardins, Guillaume, Rusu, Andrei A, Milan, Kieran, Quan, John, Ramalho, Tiago, Grabska-Barwinska, Agnieszka and others [bib]
Memory Aware Synapses: Learning what (not) to forget , (2017) by Rahaf Aljundi, Francesca Babiloni, Mohamed Elhoseiny, Marcus Rohrbach and Tinne Tuytelaars [bib]

The importance of each parameter is measured by its contribution to changes in the learned prediction function

Continual Learning Through Synaptic Intelligence , (2017) by Zenke, Friedemann, Poole, Ben and Ganguli, Surya [bib]

Synaptic Intelligence (SI). The importance of each parameter is measured by its contribution to changes in the loss.
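
Several entries in this section (EWC, Memory Aware Synapses, Synaptic Intelligence) share a common shape: a per-parameter importance weight scales a squared penalty that keeps parameters close to the values stored after the previous task. The snippet below is a minimal sketch of that shared shape only, not any paper's implementation. The function names and the `strength` coefficient are hypothetical, and the importance values are assumed to be given (each method estimates them differently, and constant factors vary between papers).

```python
def quadratic_penalty(params, anchor_params, importance, strength=1.0):
    """Importance-weighted squared distance to the parameters stored after the previous task.

    params, anchor_params and importance are flat lists of floats of equal length.
    How `importance` is estimated is method-specific and is NOT shown here.
    """
    if not (len(params) == len(anchor_params) == len(importance)):
        raise ValueError("all inputs must have the same length")
    total = 0.0
    for theta, theta_old, omega in zip(params, anchor_params, importance):
        total += omega * (theta - theta_old) ** 2
    return strength * total


def regularized_loss(task_loss, params, anchor_params, importance, strength=1.0):
    """Loss on the current task plus the penalty protecting earlier tasks."""
    return task_loss + quadratic_penalty(params, anchor_params, importance, strength)
```

Parameters with a large importance value are expensive to move, while parameters with an importance near zero remain free to adapt to the new task.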

Distillation Methods

Overcoming Catastrophic Forgetting With Unlabeled Data in the Wild , (2019) by Lee, Kibok, Lee, Kimin, Shin, Jinwoo and Lee, Honglak [bib]

Introducing global distillation loss and balanced finetuning; leveraging unlabeled data in the open world setting (Single-head setting)

Large scale incremental learning , (2019) by Wu, Yue, Chen, Yinpeng, Wang, Lijuan, Ye, Yuancheng, Liu, Zicheng, Guo, Yandong and Fu, Yun [bib]

Introducing bias parameters to the last fully connected layer to resolve the data imbalance issue (Single-head setting)

Lifelong learning via progressive distillation and retrospection , (2018) by Hou, Saihui, Pan, Xinyu, Change Loy, Chen, Wang, Zilei and Lin, Dahua [bib]

Introducing an expert of the current task in the knowledge distillation method (Multi-head setting)

End-to-end incremental learning , (2018) by Castro, Francisco M, Marin-Jimenez, Manuel J, Guil, Nicolas, Schmid, Cordelia and Alahari, Karteek [bib]

Finetuning the last fully connected layer with a balanced dataset to resolve the data imbalance issue (Single-head setting)

Learning without forgetting , (2017) by Li, Zhizhong and Hoiem, Derek [bib]

Functional regularization through distillation (keeping the output of the updated network on the new data close to the output of the old network on the new data)

iCaRL: Incremental classifier and representation learning , (2017) by Rebuffi, Sylvestre-Alvise, Kolesnikov, Alexander, Sperl, Georg and Lampert, Christoph H [bib]

Binary cross-entropy loss for representation learning & exemplar memory (or coreset) for replay (Single-head setting)
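
The distillation idea summarized under the Learning without Forgetting entry (keeping the updated network's outputs on new data close to the old network's outputs on the same data) can be sketched as a cross-entropy between temperature-softened distributions. This is a minimal illustration under assumptions, not the paper's code: the function names are hypothetical, logits are plain lists of floats, and the default temperature is only an example value.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by `temperature`."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between the old network's softened outputs (targets)
    and the updated network's softened outputs, both computed on the same input."""
    targets = softmax(old_logits, temperature)
    preds = softmax(new_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, preds))
```

In training, a term like this would typically be added to the usual loss on the new task, so that the network learns the new classes while its responses on the old outputs drift as little as possible.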

Rehearsal Methods

Efficient Lifelong Learning with A-GEM , (2019) by Chaudhry, Arslan, Ranzato, Marc’Aurelio, Rohrbach, Marcus and Elhoseiny, Mohamed [bib]

More efficient GEM; Introduces online continual learning

Orthogonal Gradient Descent for Continual Learning , (2019) by Mehrdad Farajtabar, Navid Azizan, Alex Mott and Ang Li [bib]

projecting the gradients from new tasks onto a subspace in which the neural network output on previous task does not change and the projected gradient is still in a useful direction for learning the new task

Gradient based sample selection for online continual learning , (2019) by Aljundi, Rahaf, Lin, Min, Goujaud, Baptiste and Bengio, Yoshua [bib]

sample selection as a constraint reduction problem based on the constrained optimization view of continual learning

Online Continual Learning with Maximal Interfered Retrieval , (2019) by Aljundi, Rahaf, Caccia, Lucas, Belilovsky, Eugene, Caccia, Massimo, Lin, Min, Charlin, Laurent and Tuytelaars, Tinne [bib]

Controlled sampling of memories for replay to automatically rehearse on tasks currently undergoing the most forgetting

Online Learned Continual Compression with Adaptative Quantization Module , (2019) by Caccia, Lucas, Belilovsky, Eugene, Caccia, Massimo and Pineau, Joelle [bib]

Uses stacks of VQ-VAE modules to progressively compress the data stream, enabling better rehearsal

Experience replay for continual learning , (2019) by Rolnick, David, Ahuja, Arun, Schwarz, Jonathan, Lillicrap, Timothy and Wayne, Gregory [bib]
Gradient Episodic Memory for Continual Learning , (2017) by Lopez-Paz, David and Ranzato, Marc-Aurelio [bib]

A model that alleviates CF via constrained optimization

iCaRL: Incremental classifier and representation learning , (2017) by Rebuffi, Sylvestre-Alvise, Kolesnikov, Alexander, Sperl, Georg and Lampert, Christoph H [bib]

Binary cross-entropy loss for representation learning & exemplar memory (or coreset) for replay (Single-head setting)
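
The constrained-optimization view behind the GEM and A-GEM entries can be illustrated with the single-constraint projection used by A-GEM: when the gradient on the current batch conflicts with a reference gradient computed on a batch sampled from episodic memory, the conflicting component is removed. The sketch below assumes gradients are flat lists of floats; the helper names are hypothetical, and computing the two gradients is left out.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def agem_project(grad, ref_grad):
    """Return `grad` unchanged when it does not conflict with `ref_grad`;
    otherwise remove the component of `grad` that points against `ref_grad`."""
    inner = dot(grad, ref_grad)
    ref_norm_sq = dot(ref_grad, ref_grad)
    if inner >= 0.0 or ref_norm_sq == 0.0:
        return list(grad)
    scale = inner / ref_norm_sq
    return [g - scale * r for g, r in zip(grad, ref_grad)]
```

After projection the inner product with the reference gradient is no longer negative, so to first order a small step along the result should not increase the loss on the memory batch.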

Generative Replay Methods

Brain-Like Replay For Continual Learning With Artificial Neural Networks , (2020) by van de Ven, Gido M, Siegelmann, Hava T and Tolias, Andreas S [bib]
Learning to remember: A synaptic plasticity driven framework for continual learning , (2019) by Ostapenko, Oleksiy, Puscas, Mihai, Klein, Tassilo, Jahnichen, Patrick and Nabi, Moin [bib]

Introduces Dynamic Generative Memory (DGM), which relies on conditional generative adversarial networks with learnable connection plasticity realized with neural masking

Generative Models from the perspective of Continual Learning , (2019) by Lesort, Timothée, Caselles-Dupré, Hugo, Garcia-Ortiz, Michael, Goudou, Jean-François and Filliat, David [bib]

Extensive evaluation of CL methods for generative modeling

Marginal replay vs conditional replay for continual learning , (2019) by Lesort, Timothée, Gepperth, Alexander, Stoian, Andrei and Filliat, David [bib]

Extensive evaluation of generative replay methods

Generative replay with feedback connections as a general strategy for continual learning , (2018) by Gido M. van de Ven and Andreas S. Tolias [bib]

smarter Generative Replay

Continual learning with deep generative replay , (2017) by Shin, Hanul, Lee, Jung Kwon, Kim, Jaehong and Kim, Jiwon [bib]

Introduces generative replay

Dynamic Architectures or Routing Methods

ORACLE: Order Robust Adaptive Continual Learning , (2019) by Jaehong Yoon, Saehoon Kim, Eunho Yang and Sung Ju Hwang [bib]
Random Path Selection for Incremental Learning , (2019) by Jathushan Rajasegaran, Munawar Hayat, Salman H. Khan, Fahad Shahbaz Khan and Ling Shao [bib]

Proposes a random path selection algorithm, called RPSnet, that progressively chooses optimal paths for the new tasks while encouraging parameter sharing and reuse

Incremental Learning through Deep Adaptation , (2018) by Amir Rosenfeld and John K. Tsotsos [bib]
Continual Learning in Practice , (2018) by Diethe, Tom, Borchert, Tom, Thereska, Eno, Pigem, Borja de Balle and Lawrence, Neil [bib]

Proposes a reference architecture for a continual learning system

Progressive Neural Networks , (2016) by Rusu, A. A., Rabinowitz, N. C., Desjardins, G., Soyer, H., Kirkpatrick, J., Kavukcuoglu, K., Pascanu, R. and Hadsell, R. [bib]

Each task has a specific model connected to the previous ones

Hybrid Methods

Continual learning with hypernetworks , (2020) by Johannes von Oswald, Christian Henning, João Sacramento and Benjamin F. Grewe [bib]

Learning task-conditioned hypernetworks for continual learning as well as task embeddings; hypernetworks offer good model compression.

Compacting, Picking and Growing for Unforgetting Continual Learning , (2019) by Hung, Ching-Yi, Tu, Cheng-Hao, Wu, Cheng-En, Chen, Chien-Hung, Chan, Yi-Ming and Chen, Chu-Song [bib]

The approach leverages the principles of deep model compression, critical weight selection, and progressive network expansion, all enforced in an iterative manner

Continual Few-Shot Learning

Wandering Within a World: Online Contextualized Few-Shot Learning , (2020) by Mengye Ren, Michael L. Iuzzolino, Michael C. Mozer and Richard S. Zemel [bib]

Proposes a new continual few-shot setting where spatial and temporal context can be leveraged and unseen classes need to be predicted

Defining Benchmarks for Continual Few-Shot Learning , (2020) by Antoniou, Antreas, Patacchiola, Massimiliano, Ochal, Mateusz and Storkey, Amos [bib]

(title is a good enough summary)

Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning , (2020) by Caccia, Massimo, Rodriguez, Pau, Ostapenko, Oleksiy, Normandin, Fabrice, Lin, Min, Caccia, Lucas, Laradji, Issam, Rish, Irina, Lacoste, Alexandre, Vazquez, David and others [bib]

Proposes a new approach to CL evaluation more aligned with real-life applications, bringing CL closer to Online Learning and Open-World learning

Learning from the Past: Continual Meta-Learning via Bayesian Graph Modeling , (2019) by Yadan Luo, Zi Huang, Zheng Zhang, Ziwei Wang, Mahsa Baktashmotlagh and Yang Yang [bib]
Online Meta-Learning , (2019) by Finn, Chelsea, Rajeswaran, Aravind, Kakade, Sham and Levine, Sergey [bib]

Defines online meta-learning; proposes Follow the Meta Leader (FTML) (~ online MAML)

Reconciling meta-learning and continual learning with online mixtures of tasks , (2019) by Jerfel, Ghassen, Grant, Erin, Griffiths, Tom and Heller, Katherine A [bib]

Meta-learns a task structure; continual adaptation via a non-parametric prior

Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL , (2019) by Anusha Nagabandi, Chelsea Finn and Sergey Levine [bib]

Formulates an online learning procedure that uses SGD to update model parameters, and an EM algorithm with a Chinese restaurant process prior to develop and maintain a mixture of models that handles non-stationary task distributions

Task Agnostic Continual Learning via Meta Learning , (2019) by Xu He, Jakub Sygnowski, Alexandre Galashov, Andrei A. Rusu, Yee Whye Teh and Razvan Pascanu [bib]

Introduces What & How framework; enables Task Agnostic CL with meta learned task inference

Meta-Continual Learning

La-MAML: Look-ahead Meta Learning for Continual Learning , (2020) by Gunshi Gupta, Karmesh Yadav and Liam Paull [bib]

Proposes an online replay-based meta-continual learning algorithm with learning-rate modulation to mitigate catastrophic forgetting

Learning to Continually Learn , (2020) by Beaulieu, Shawn, Frati, Lapo, Miconi, Thomas, Lehman, Joel, Stanley, Kenneth O, Clune, Jeff and Cheney, Nick [bib]

Follow-up of OML. Meta-learns an activation-gating function instead.

Meta-Learning Representations for Continual Learning , (2019) by Javed, Khurram and White, Martha [bib]

Introduces OML, which learns how to continually learn, i.e. how to do online updates without forgetting.

Meta-learnt priors slow down catastrophic forgetting in neural networks , (2019) by Spigler, Giacomo [bib]

Training MAML in a meta-continual-learning fashion slows down forgetting

Learning to learn without forgetting by maximizing transfer and minimizing interference , (2018) by Riemer, Matthew, Cases, Ignacio, Ajemian, Robert, Liu, Miao, Rish, Irina, Tu, Yuhai and Tesauro, Gerald [bib]

Lifelong Reinforcement Learning

Continual learning for robotics: Definition, framework, learning strategies, opportunities and challenges , (2020) by Timothée Lesort, Vincenzo Lomonaco, Andrei Stoian, Davide Maltoni, David Filliat and Natalia Díaz-Rodríguez [bib]
Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL , (2019) by Anusha Nagabandi, Chelsea Finn and Sergey Levine [bib]

Formulates an online learning procedure that uses SGD to update model parameters, and an EM algorithm with a Chinese restaurant process prior to develop and maintain a mixture of models that handles non-stationary task distributions

Experience replay for continual learning , (2019) by Rolnick, David, Ahuja, Arun, Schwarz, Jonathan, Lillicrap, Timothy and Wayne, Gregory [bib]

Continual Generative Modeling

Continual Unsupervised Representation Learning , (2019) by Dushyant Rao, Francesco Visin, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu and Raia Hadsell [bib]

Introduces unsupervised continual learning (no task label and no task boundaries)

Generative Models from the perspective of Continual Learning , (2019) by Lesort, Timothée, Caselles-Dupré, Hugo, Garcia-Ortiz, Michael, Goudou, Jean-François and Filliat, David [bib]

Extensive evaluation of CL methods for generative modeling

Lifelong Generative Modeling , (2017) by Ramapuram, Jason, Gregorova, Magda and Kalousis, Alexandros [bib]

Applications

CLOPS: Continual Learning of Physiological Signals , (2020) by Kiyasseh, Dani, Zhu, Tingting and Clifton, David A [bib]

a healthcare-specific replay-based method to mitigate destructive interference during continual learning

LAMAL: LAnguage Modeling Is All You Need for Lifelong Language Learning , (2020) by Fan-Keng Sun, Cheng-Hao Ho and Hung-Yi Lee [bib]
Compositional Language Continual Learning , (2020) by Yuanpeng Li, Liang Zhao, Kenneth Church and Mohamed Elhoseiny [bib]

method for compositional continual learning of sequence-to-sequence models

Unsupervised real-time anomaly detection for streaming data , (2017) by Ahmad, Subutai, Lavin, Alexander, Purdy, Scott and Agha, Zuha [bib]

HTM applied to real-world anomaly detection problem

Continuous online sequence learning with an unsupervised neural network model , (2016) by Cui, Yuwei, Ahmad, Subutai and Hawkins, Jeff [bib]

HTM applied to a prediction problem of taxi passenger demand

Thesis

Continual Learning: Tackling Catastrophic Forgetting in Deep Neural Networks with Replay Processes , (2020) by Timothée Lesort [bib]
Continual Learning with Deep Architectures , (2019) by Vincenzo Lomonaco [bib]
Continual Learning in Neural Networks , (2019) by Aljundi, Rahaf [bib]
Continual learning in reinforcement environments , (1994) by Ring, Mark Bishop [bib]

Workshops

Workshop on Continual Learning at ICML 2020 , (2020) by Rahaf Aljundi, Haytham Fayek, Eugene Belilovsky, David Lopez-Paz, Arslan Chaudhry, Marc Pickett, Puneet Dokania, Jonathan Schwarz, Sayna Ebrahimi [bib]
4th Lifelong Machine Learning Workshop at ICML 2020 , (2020) by Shagun Sodhani, Sarath Chandar, Balaraman Ravindran and Doina Precup [bib]
