
Summary of Efficient Training of Large Vision Models via Advanced Automated Progressive Learning, by Changlin Li et al.


Efficient Training of Large Vision Models via Advanced Automated Progressive Learning

by Changlin Li, Jiawei Zhang, Sihao Lin, Zongxin Yang, Junwei Liang, Xiaodan Liang, Xiaojun Chang

First submitted to arXiv on: 6 Sep 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty | Written by | Summary
High | Paper authors | High Difficulty Summary: see the paper's original abstract.
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The paper presents an advanced automated progressive learning (AutoProg) framework for efficient training of Large Vision Models (LVMs), focusing on Vision Transformers (ViTs) and diffusion models. The authors first develop AutoProg-One, which combines momentum growth (MoGrow) with a one-shot growth schedule search and accelerates ViT pre-training by up to 1.85x on ImageNet. They then introduce AutoProg-Zero, which extends the framework with a novel zero-shot unfreezing schedule search and so removes the need for one-shot supernet training. They also propose a Unique Stage Identifier (SID) scheme to bridge the gap during network growth. The authors report that AutoProg accelerates fine-tuning of diffusion models by up to 2.86x, with comparable or even higher performance.
Low | GrooveSquid.com (original content) | Low Difficulty Summary: This paper presents an approach to training Large Vision Models (LVMs) more efficiently. By using progressive learning strategies, researchers can reduce the computational resources needed to train these powerful AI models. The authors develop a framework called AutoProg that automates this process. They show that their approach can speed up ViT pre-training by up to 1.85x and diffusion model fine-tuning by up to 2.86x, without sacrificing performance.
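The speedup factors quoted in the medium difficulty summary (1.85x and 2.86x) are easy to misread as percentages. The short Python sketch below shows the arithmetic for converting a speedup factor into "percent faster" and into the fraction of training time saved. The function names are illustrative and do not come from the paper, and the calculation assumes the factor is a ratio of baseline time to accelerated time.

```python
def percent_faster(speedup: float) -> float:
    """Throughput gain in percent for a speedup factor (assumed baseline_time / new_time)."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (speedup - 1.0) * 100.0


def time_saved_fraction(speedup: float) -> float:
    """Fraction of the baseline training time that is no longer needed."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# Speedup factors as reported in the summary above.
reported = [("ViT pre-training", 1.85), ("diffusion fine-tuning", 2.86)]

for task, factor in reported:
    print(
        f"{task}: {factor}x = {percent_faster(factor):.0f}% faster, "
        f"{time_saved_fraction(factor) * 100:.0f}% less training time"
    )
```

Under that assumption, a 2.86x speedup corresponds to roughly 186% faster training, not 86%, and to about two thirds of the training time saved.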

Keywords

» Artificial intelligence  » Diffusion  » Fine tuning  » One shot  » ViT  » Zero shot