Summary of Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity, by Wentao Guo et al.
Zeroth-Order Fine-Tuning of LLMs with Extreme Sparsity
by Wentao Guo, Jikai Long, Yimeng Zeng, Zirui Liu, Xinyu Yang, Yide Ran, Jacob R. Gardner, Osbert Bastani, Christopher De Sa, Xiaodong Yu, Beidi Chen, Zhaozhuo Xu
First submitted to arxiv on: 5 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper proposes a method for fine-tuning large language models (LLMs) in memory-constrained settings using zeroth-order optimization (ZO). ZO estimates gradients using only forward passes, avoiding the memory cost of backpropagation; the authors make it practical by integrating sparsity and quantization into ZO fine-tuning. The key innovation is identifying a small subset of "sensitive parameters" to tune with ZO while keeping the remaining, untouched parameters quantized (a minimal code sketch of this idea follows the table). The results show that fine-tuning just 0.1% of the parameters this way outperforms full ZO fine-tuning while offering a significant speedup. The authors also demonstrate efficient fine-tuning on a GPU with limited memory and reduced latency. |
Low | GrooveSquid.com (original content) | This paper is about making it possible to adapt big language models on devices with limited memory, like smartphones or laptops. Right now, adjusting these models requires too much memory for such settings. The authors use a technique called zeroth-order optimization, which tunes the model using only forward passes and therefore needs far less memory than standard training. They also found that adjusting only a small part of the model gives results similar to adjusting everything. This could make it possible for language models to be customized in more places and applications. |
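To make the mechanism in the medium summary concrete, below is a minimal, hypothetical sketch of one sparse zeroth-order update in PyTorch. The function name `zo_sparse_step`, the SPSA-style two-point gradient estimator, and all hyperparameters are illustrative assumptions; the paper's actual procedure for selecting sensitive parameters and its quantization scheme are not reproduced here.

```python
# Minimal sketch of sparse zeroth-order (ZO) fine-tuning, assuming a PyTorch model
# and a user-supplied loss_fn() that runs a forward pass and returns a scalar loss.
# The mask selection and hyperparameters are illustrative, not the paper's recipe.
import torch

def zo_sparse_step(params, mask, loss_fn, eps=1e-3, lr=1e-6):
    """One SPSA-style ZO update restricted to masked ('sensitive') coordinates."""
    # Random perturbation, zeroed outside the sensitive subset.
    z = [torch.randn_like(p) * m for p, m in zip(params, mask)]

    with torch.no_grad():
        # Loss at theta + eps * z (forward pass only, no backprop state kept).
        for p, zi in zip(params, z):
            p.add_(eps * zi)
        loss_plus = loss_fn()

        # Loss at theta - eps * z.
        for p, zi in zip(params, z):
            p.sub_(2 * eps * zi)
        loss_minus = loss_fn()

        # Two-point estimate of the directional derivative along z.
        grad_est = (loss_plus - loss_minus) / (2 * eps)

        # Restore theta, then take a ZO-SGD step on sensitive coordinates only.
        for p, zi in zip(params, z):
            p.add_(eps * zi)
            p.sub_(lr * grad_est * zi)
```

In this sketch the mask would select on the order of 0.1% of the coordinates; because the unmasked weights are never updated, they could remain in a quantized format. Storing `z` explicitly is done here only for readability; memory-efficient ZO implementations typically regenerate the perturbation from a saved random seed instead, keeping memory close to inference-only levels.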
Keywords
» Artificial intelligence » Fine tuning » Optimization » Quantization