Summary of Photon: Federated LLM Pre-Training, by Lorenzo Sani et al.


Photon: Federated LLM Pre-Training

by Lorenzo Sani, Alex Iacob, Zeyu Cao, Royson Lee, Bill Marino, Yan Gao, Dongqi Cai, Zexi Li, Wanru Zhao, Xinchi Qiu, Nicholas D. Lane

First submitted to arXiv on: 5 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Distributed, Parallel, and Cluster Computing (cs.DC)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper but is written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (the paper’s original abstract, written by the authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
The paper introduces Photon, a complete system for federated end-to-end large language model (LLM) pre-training that leverages cross-silo federated learning (FL) for global-scale training with minimal communication overhead. The authors show that Photon can train models of up to 7B parameters in a federated fashion while achieving better perplexity than centralized pre-training. They also demonstrate that Photon’s training time decreases as more compute becomes available, achieving a compute-time trade-off similar to that of centralized methods. Furthermore, Photon outperforms baseline distributed training methods by 35% while communicating significantly less data; a minimal sketch of this train-locally, synchronize-rarely pattern follows the summaries below.

Low Difficulty Summary (original content by GrooveSquid.com)
Federated learning is a way to train artificial intelligence models without sharing personal data between different places. A team of researchers created a new system called Photon that allows many machines with limited computing power to work together and train large language models. This is the first time this has been achieved, and it is an important step forward in developing more powerful AI models. The authors show that their system can train larger models than before while also greatly reducing the amount of data that needs to be sent between the machines during training.
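
The summaries describe Photon’s training loop only at a high level: each participating silo trains on its own data for many steps and only occasionally exchanges model weights, which is where the communication savings come from. The exact aggregation rule, optimizer, and schedule are not given here, so the following is only a minimal, FedAvg-style sketch of that train-locally, synchronize-rarely pattern in PyTorch; all helper names, the Hugging Face-style model interface, and the hyperparameters are hypothetical and are not Photon’s actual implementation.

# Minimal FedAvg-style sketch of cross-silo federated training.
# Hypothetical helper names; illustrates the "train locally, synchronize
# rarely" pattern, not Photon's actual implementation.
import copy
from itertools import cycle

import torch


def average_state_dicts(state_dicts):
    # Element-wise average of the clients' model parameters.
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg


def local_train(model, data_loader, local_steps, lr=1e-4):
    # Each silo runs many local optimizer steps before communicating anything.
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    batches = cycle(data_loader)
    for _ in range(local_steps):
        batch = next(batches)
        loss = model(**batch).loss  # assumes a Hugging Face-style causal-LM interface
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    return model.state_dict()


def federated_round(global_model, client_loaders, local_steps=500):
    # One communication round: broadcast weights, train locally, aggregate.
    client_states = []
    for loader in client_loaders:
        client_model = copy.deepcopy(global_model)  # broadcast the global weights
        client_states.append(local_train(client_model, loader, local_steps))
    global_model.load_state_dict(average_state_dicts(client_states))
    return global_model

In this pattern the communication saving comes from local_steps being large: each silo exchanges full model weights once per round instead of gradients at every step, which is what makes training over poorly connected sites practical.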

Keywords

» Artificial intelligence  » Federated learning  » Large language model  » Perplexity