Summary of Model Stock: All We Need Is Just a Few Fine-tuned Models, by Dong-Hwan Jang et al.
Model Stock: All we need is just a few fine-tuned models
by Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han
First submitted to arXiv on: 28 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract, available on arXiv. |
| Medium | GrooveSquid.com (original content) | This paper presents a novel approach to fine-tuning large pre-trained models, achieving strong performance both in-distribution (ID) and out-of-distribution (OOD). Unlike traditional methods that average many fine-tuned models, it needs significantly fewer models to achieve superior accuracy. The authors uncover a link between performance and proximity to the center of weight space, and introduce a method that approximates a center-close weight using only two fine-tuned models. This layer-wise weight-averaging technique, coined Model Stock, surpasses state-of-the-art merging methods such as Model Soup while using only two fine-tuned models. The authors demonstrate Model Stock's efficacy on standard benchmarks, achieving remarkable ID and OOD performance with pre-trained CLIP architectures at minimal extra computational cost. |
| Low | GrooveSquid.com (original content) | This paper is about a new way to make big AI models work better. Instead of needing many fine-tuned copies of a model to average together, this method combines just two to achieve great results. The authors found that if they combine these copies in just the right way, layer by layer, it can be very effective. They call this approach “Model Stock” and show that it works well on big tasks like recognizing images or understanding text. The new method is simple and doesn’t require much extra computing power, making it a useful tool for AI researchers. |
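The summaries above describe the core idea: per layer, move the average of two fine-tuned weights toward the pre-trained weight so the result lands closer to the center of weight space. The sketch below is an illustrative reconstruction, not code from the paper: it assumes weights stored as plain `dict`s of flat float lists, and an interpolation ratio `t = 2·cosθ / (1 + cosθ)` derived from the angle θ between the two fine-tuning residuals (the paper's exact ratio is not given in this summary).

```python
import math

def model_stock_merge(w0, w1, w2, eps=1e-8):
    """Layer-wise merge of two fine-tuned models with their pre-trained init.

    w0: pre-trained weights; w1, w2: two fine-tuned weights.
    Each is a dict mapping layer name -> flat list of floats.
    (Hypothetical helper; the ratio formula is an assumption.)
    """
    merged = {}
    for name in w0:
        # Residuals of each fine-tuned model relative to the pre-trained weights.
        d1 = [a - b for a, b in zip(w1[name], w0[name])]
        d2 = [a - b for a, b in zip(w2[name], w0[name])]
        dot = sum(x * y for x, y in zip(d1, d2))
        n1 = math.sqrt(sum(x * x for x in d1))
        n2 = math.sqrt(sum(x * x for x in d2))
        cos = dot / (n1 * n2 + eps)
        # Assumed ratio: well-aligned residuals -> trust the average;
        # near-orthogonal residuals -> fall back toward the pre-trained weights.
        t = 2.0 * cos / (1.0 + cos + eps)
        merged[name] = [
            t * (a + b) / 2.0 + (1.0 - t) * c
            for a, b, c in zip(w1[name], w2[name], w0[name])
        ]
    return merged
```

The per-layer angle is what makes this cheap: only two fine-tuned runs are needed, and the merge itself is a single pass over the weights with no extra training.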
Keywords
- Artificial intelligence
- Fine-tuning