See Also

Links
- “Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data”, Gerstgrasser et al 2024
- “Simple and Scalable Strategies to Continually Pre-Train Large Language Models”, Ibrahim et al 2024
- “Online Adaptation of Language Models With a Memory of Amortized Contexts (MAC)”, Tack et al 2024
- “RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
- “LLaMA Pro: Progressive LLaMA With Block Expansion”, Wu et al 2024
- “Loss of Plasticity in Deep Continual Learning”, Dohare et al 2023
- “Continual Diffusion: Continual Customization of Text-To-Image Diffusion With C-LoRA”, Smith et al 2023
- “Understanding Plasticity in Neural Networks”, Lyle et al 2023
- “Broken Neural Scaling Laws”, Caballero et al 2022
- “Exclusive Supermask Subnetwork Training for Continual Learning”, Yadav & Bansal 2022
- “On the Effectiveness of Compact Biomedical Transformers (✱BioBERT)”, Rohanian et al 2022
- “Don’t Stop Learning: Towards Continual Learning for the CLIP Model”, Ding et al 2022
- “Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
- “Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline (3RL)”, Caccia et al 2022
- “CT0: Fine-Tuned Language Models Are Continual Learners”, Scialom et al 2022
- “Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models”, Tirumala et al 2022
- “Continual Pre-Training Mitigates Forgetting in Language and Vision”, Cossu et al 2022
- “Continual Learning With Foundation Models: An Empirical Study of Latent Replay”, Ostapenko et al 2022
- “DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning”, Wang et al 2022
- “Effect of Scale on Catastrophic Forgetting in Neural Networks”, Ramasesh et al 2022
- “The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns via Spotlights of Attention”, Irie et al 2022
- “Learning to Prompt for Continual Learning”, Wang et al 2021
- “An Empirical Investigation of the Role of Pre-Training in Lifelong Learning”, Mehta et al 2021
- “The Geometry of Representational Drift in Natural and Artificial Neural Networks”, Aitken et al 2021
- “Wide Neural Networks Forget Less Catastrophically”, Mirzadeh et al 2021
- “Lifelong Pretraining: Continually Adapting Language Models to Emerging Corpora”, Jin et al 2021
- “Continuous Coordination As a Realistic Scenario for Lifelong Learning”, Nekoei et al 2021
- “Learning from the Past: Meta-Continual Learning With Knowledge Embedding for Jointly Sketch, Cartoon, and Caricature Face Recognition”, Zheng et al 2020b
- “Meta-Learning through Hebbian Plasticity in Random Networks”, Najarro & Risi 2020
- “Learning to Learn With Feedback and Local Plasticity”, Lindsey & Litwin-Kumar 2020
- “Don’t Stop Pretraining: Adapt Language Models to Domains and Tasks”, Gururangan et al 2020
- “Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning”, Julian et al 2020
- “On Warm-Starting Neural Network Training”, Ash & Adams 2019
- “Gated Linear Networks”, Veness et al 2019
- “Self-Net: Lifelong Learning via Continual Self-Modeling”, Camp et al 2018
- “Unicorn: Continual Learning With a Universal, Off-Policy Agent”, Mankowitz et al 2018
- “Meta Networks”, Munkhdalai & Yu 2017
- “PathNet: Evolution Channels Gradient Descent in Super Neural Networks”, Fernando et al 2017
- “Overcoming Catastrophic Forgetting in Neural Networks”, Kirkpatrick et al 2016
Sort By Magic
Annotations sorted by machine learning into inferred 'tags'. This provides an alternative way to browse: instead of by date order, one can browse in topic order. The 'sorted' list has been automatically clustered into multiple sections & auto-labeled for easier browsing.
Beginning with the newest annotation, it uses the embedding of each annotation to attempt to create a list of nearest-neighbor annotations, creating a progression of topics (a minimal sketch of one such greedy nearest-neighbor ordering follows the tag list below). For more details, see the link.
- neural-dynamics
- plasticity-enhancement
- continual-prompting
- pretraining-continual
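A minimal sketch of how such an embedding-based topic ordering could work, assuming precomputed annotation embeddings; the greedy nearest-neighbor walk, cosine-similarity choice, and all names here are illustrative assumptions, not the site's actual implementation:

```python
# Sketch of a "sort by magic": starting from the newest annotation,
# greedily walk to the nearest unvisited neighbor in embedding space,
# yielding a topic-ordered progression rather than a date-ordered one.
import numpy as np

def topic_order(embeddings: np.ndarray) -> list[int]:
    """Greedy nearest-neighbor ordering; row 0 is the newest annotation."""
    # Normalize rows so dot products are cosine similarities.
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    n = len(normed)
    order, visited = [0], {0}
    while len(order) < n:
        sims = normed @ normed[order[-1]]  # similarity to the current item
        sims[list(visited)] = -np.inf      # exclude already-placed items
        nxt = int(np.argmax(sims))         # closest remaining annotation
        order.append(nxt)
        visited.add(nxt)
    return order

# Example: 5 fake annotation embeddings, newest first.
rng = np.random.default_rng(0)
print(topic_order(rng.normal(size=(5, 64))))
```

The resulting chain can then be cut into sections and auto-labeled (e.g. by clustering) to produce tag groups like those above.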
Miscellaneous
Link Bibliography
- https://arxiv.org/abs/2401.08406#microsoft : “RAG vs Fine-Tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture”, Balaguer et al 2024
- https://arxiv.org/abs/2206.14349 : “Fleet-DAgger: Interactive Robot Fleet Learning With Scalable Human Supervision”, Hoque et al 2022
- https://arxiv.org/abs/2205.12393 : “CT0: Fine-Tuned Language Models Are Continual Learners”, Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- https://arxiv.org/abs/2110.11526#deepmind : “Wide Neural Networks Forget Less Catastrophically”, Seyed Iman Mirzadeh, Arslan Chaudhry, Dong Yin, Huiyi Hu, Razvan Pascanu, Dilan Gorur, Mehrdad Farajtabar