
Soft-TransFormers for Continual Learning

by Haeyong Kang, Chang D. Yoo

First submitted to arXiv on: 25 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
The proposed Soft-TransFormers (Soft-TF) method is a fully fine-tuned continual learning (CL) approach inspired by the Well-initialized Lottery Ticket Hypothesis. For each task, it sequentially learns and selects an optimal soft (real-valued) network or subnetwork, jointly optimizing the weights of sparse layers to obtain task-adaptive networks while keeping the well-pre-trained layers frozen. At inference, the identified task-adaptive network masks the pre-trained network's parameters, preserving prior knowledge and minimizing catastrophic forgetting (CF); a minimal code sketch of this masking idea follows the summaries below. Supported by a convergence analysis, the method achieves state-of-the-art performance across CL scenarios, including class-incremental learning (CIL) and task-incremental learning (TIL).
Low Difficulty Summary (original content by GrooveSquid.com)
Soft-TransFormers is a new way for computers to learn new tasks without forgetting old ones. Imagine a computer that has learned to recognize different animals and then needs to learn about new types of fish: normally, learning the fish would cause it to forget some of its original animal-recognition skills. Soft-TF prevents this by creating a special network that adapts to each new task while keeping the old knowledge intact. The approach is tested on two popular AI models (ViT and CLIP) and outperforms previous methods in various learning scenarios.
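
To make the frozen-weights-plus-learnable-mask idea concrete, here is a minimal PyTorch sketch. It is an illustration under assumptions, not the authors' implementation: the SoftMaskedLinear class name, the per-layer mask placement, and the all-ones initialization are hypothetical, and the paper's actual method operates on pre-trained Transformer layers (ViT, CLIP) rather than a single linear layer.

# Minimal sketch of the soft-masking idea (hypothetical code, not the
# authors' implementation): a learnable real-valued mask is the only
# trainable parameter; the pre-trained weights stay frozen.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftMaskedLinear(nn.Module):  # hypothetical class name
    def __init__(self, pretrained: nn.Linear):
        super().__init__()
        # Freeze the well-pre-trained parameters.
        self.weight = nn.Parameter(pretrained.weight.detach().clone(), requires_grad=False)
        self.bias = nn.Parameter(pretrained.bias.detach().clone(), requires_grad=False)
        # Task-adaptive soft (real-valued) mask, initialized to all ones
        # so training starts from the unmodified pre-trained layer.
        self.mask = nn.Parameter(torch.ones_like(self.weight))

    def forward(self, x):
        # The mask modulates the frozen weights elementwise, both during
        # task training and at inference.
        return F.linear(x, self.mask * self.weight, self.bias)

# Usage: wrap a pre-trained layer and train only the mask for a new task.
layer = SoftMaskedLinear(nn.Linear(768, 768))
optimizer = torch.optim.Adam([layer.mask], lr=1e-3)

Storing one such mask per task lets the model switch between tasks at inference without ever overwriting the shared pre-trained weights, which is how the approach keeps old knowledge intact.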

Keywords

* Artificial intelligence
* Continual learning
* Inference
* ViT