Summary of LoQT: Low-Rank Adapters for Quantized Pretraining, by Sebastian Loeschcke et al.
LoQT: Low-Rank Adapters for Quantized Pretraining
by Sebastian Loeschcke, Mads Toftrup, Michael J. Kastoryano, Serge Belongie, Vésteinn Snæbjarnarson
First submitted to arXiv on: 26 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
| --- | --- | --- |
| High | Paper authors | The paper’s original abstract, available on the arXiv page. |
| Medium | GrooveSquid.com (original content) | LoQT (Low-Rank Adapters for Quantized Pretraining) enables efficient training of large models on consumer hardware by combining low-rank adapters with quantized weights. It uses gradient-based tensor factorization to initialize low-rank trainable weight matrices, which are periodically merged into the quantized full-rank weight matrices, so only a small set of parameters needs gradients and optimizer state at any time. The approach works for both pretraining and fine-tuning: empirical results on language modeling and downstream task adaptation show that LoQT can train models of up to 7B parameters on a single 24GB GPU, and even larger models (13B parameters) when combined with per-layer gradient updates. A conceptual code sketch follows the table. |
| Low | GrooveSquid.com (original content) | LoQT is a new way to train big computer programs that understand language. Right now, it’s hard to train them on a normal graphics card without special tricks, like breaking the model into smaller pieces or storing its numbers less precisely. LoQT combines both tricks: it trains small pieces and then folds them back into the big, compactly stored model. This lets the computer learn while using much less memory! Scientists tested LoQT on language modeling and found that it worked really well, even for very large models with billions of parts. |
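
To make the mechanism described in the medium-difficulty summary concrete, here is a minimal PyTorch-style sketch of the idea: a frozen quantized weight plus trainable low-rank factors that are periodically merged back into the full-rank weight and re-quantized. This is not the authors’ implementation; the names (`LoQTLinear`, `fake_quantize`, `merge_and_requantize`), the rank-8 default, and the uniform round-to-grid quantizer are illustrative assumptions, and the paper’s actual initialization and quantization details differ.

```python
# Conceptual sketch only -- illustrative names, not the paper's code.
import torch


def fake_quantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Simulate low-bit quantization by rounding onto a uniform grid.

    A stand-in for a real low-bit quantizer (assumption, not the paper's scheme).
    """
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return (w / scale).round().clamp(-qmax - 1, qmax) * scale


class LoQTLinear(torch.nn.Module):
    """A linear layer stored as a frozen quantized weight Q plus a trainable
    low-rank correction P @ B, i.e. y = x @ (Q + P @ B)^T."""

    def __init__(self, in_features: int, out_features: int, rank: int = 8):
        super().__init__()
        w = torch.randn(out_features, in_features) * 0.02
        self.register_buffer("q_weight", fake_quantize(w))  # frozen, quantized
        # Low-rank factors: only these receive gradients and optimizer state.
        self.P = torch.nn.Parameter(torch.zeros(out_features, rank))
        self.B = torch.nn.Parameter(torch.randn(rank, in_features) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ (self.q_weight + self.P @ self.B).T

    @torch.no_grad()
    def init_projection_from_gradient(self, weight_grad: torch.Tensor) -> None:
        """Gradient-based initialization: set P from the top singular vectors
        of a full weight gradient (the 'gradient-based tensor factorization'
        mentioned in the summary)."""
        u, _, _ = torch.linalg.svd(weight_grad, full_matrices=False)
        self.P.copy_(u[:, : self.P.shape[1]])

    @torch.no_grad()
    def merge_and_requantize(self) -> None:
        """Periodic merge: fold the low-rank update into the full-rank weight,
        re-quantize it, and reset the adapter for the next interval."""
        self.q_weight.copy_(fake_quantize(self.q_weight + self.P @ self.B))
        self.B.zero_()
```

In a training loop, only `P` and `B` would be handed to the optimizer, and `merge_and_requantize()` would be called periodically; keeping the trainable parameters and optimizer state this small is what lets billion-parameter training fit on a 24GB consumer GPU.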
Keywords
» Artificial intelligence » Fine tuning » Pretraining » Quantization