Photon: Federated LLM Pre-Training
by Lorenzo Sani, Alex Iacob, Zeyu Cao, Royson Lee, Bill Marino, Yan Gao, Dongqi Cai, Zexi Li, Wanru Zhao, Xinchi Qiu, Nicholas D. Lane
First submitted to arXiv on: 5 Nov 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Distributed, Parallel, and Cluster Computing (cs.DC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
| --- | --- | --- |
| High | Paper authors | The paper’s original abstract (read it on arXiv). |
| Medium | GrooveSquid.com (original content) | The paper introduces Photon, a complete system for federated end-to-end large language model (LLM) pre-training that leverages cross-silo federated learning (FL) for global-scale training with minimal communication overhead. The authors show that Photon can train models of up to 7B parameters in a federated fashion while achieving better perplexity than centralized pre-training. They also demonstrate that Photon’s training time decreases as more compute becomes available, matching the compute–time trade-off of centralized methods. Furthermore, Photon outperforms baseline distributed training methods by 35% while communicating significantly less data (an illustrative sketch of this style of federated training follows the table). |
| Low | GrooveSquid.com (original content) | Federated learning is a way for different places to train artificial intelligence models together without sharing their private data. A team of researchers created a new system called Photon that lets many computers in different locations work together to train large language models. This is the first time this has been achieved, and it is an important step toward more powerful AI models. The authors show that their system can train larger models than before while greatly reducing the amount of data that must be communicated during training. |
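The medium-difficulty summary describes cross-silo federated LLM training only at a high level. The sketch below shows what one round of that general style of training could look like, assuming a generic FedAvg-style weighted average of locally trained models; it is not Photon’s actual implementation, and names such as `fedavg_round`, `client.data_loader`, and `client.num_tokens` are hypothetical placeholders.

```python
import copy

import torch
import torch.nn.functional as F


def fedavg_round(global_model, clients, local_steps=100, lr=1e-4):
    """One federated round (generic FedAvg-style sketch, not Photon's code):
    every silo trains locally from the current global weights, then the
    server averages the resulting models, weighting each silo by how much
    data it trained on."""
    client_states, client_weights = [], []

    for client in clients:
        # Each silo starts from a copy of the current global weights.
        local_model = copy.deepcopy(global_model)
        optimizer = torch.optim.AdamW(local_model.parameters(), lr=lr)

        # Run many local steps before communicating anything back.
        for _, (inputs, targets) in zip(range(local_steps), client.data_loader):
            optimizer.zero_grad()
            logits = local_model(inputs)  # (batch, seq, vocab) for an LLM
            loss = F.cross_entropy(
                logits.view(-1, logits.size(-1)), targets.view(-1)
            )
            loss.backward()
            optimizer.step()

        client_states.append(local_model.state_dict())
        client_weights.append(client.num_tokens)  # hypothetical data-size attribute

    # Server-side aggregation: weighted average of the locally trained models.
    total = float(sum(client_weights))
    averaged = {
        key: sum(
            state[key] * (weight / total)
            for state, weight in zip(client_states, client_weights)
        )
        for key in client_states[0]
    }
    global_model.load_state_dict(averaged)
    return global_model
```

In this kind of scheme, each silo performs many local optimization steps between aggregations, so the per-round traffic is one model upload and download per silo rather than per-step gradient exchange, which is how such systems keep communication overhead low; the specific optimizer, aggregation schedule, and communication strategy used by Photon are described in the paper itself.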
Keywords
» Artificial intelligence » Federated learning » Large language model » Perplexity