Summary of Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering, by Changxin Liu et al.
Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering
by Changxin Liu, Yanghao Li, Yuhao Yi, Karl H. Johansson
First submitted to arXiv on: 13 Sep 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Distributed, Parallel, and Cluster Computing (cs.DC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper proposes a novel distributed learning method that addresses two critical challenges: Byzantine robustness and communication reduction. Existing methods rely on full gradient information, either at every iteration or at certain iterations with some probability, and may fail to converge efficiently to the optimal solution. The proposed method leverages Polyak momentum to mitigate the noise caused by biased compressors and stochastic gradients, thereby defending against Byzantine workers under information compression (a schematic code sketch of this idea follows the table). The algorithm is proven to have tight complexity bounds for non-convex smooth loss functions that match the lower bounds in Byzantine-free scenarios. Experimental results on binary classification and image classification tasks demonstrate its practical significance. |
Low | GrooveSquid.com (original content) | This paper makes machine learning training more efficient and secure by creating a new way to share information among many computers. Right now, big models are trained by combining small pieces of work from many machines, but this process can be attacked or slowed down by faulty participants. The researchers came up with a method that is faster and more robust than current approaches. They used a technique called Polyak momentum to keep the shared information accurate and resistant to bad actors. The new method works well for both simple and complex tasks. |
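To make the idea in the medium summary concrete, below is a minimal, self-contained sketch of the general pattern: honest workers smooth their stochastic gradients with Polyak momentum, compress the momentum vector before sending it, and the server applies a robust aggregation (filtering) step before updating the model. The top-k compressor, coordinate-wise median aggregator, loss function, and all hyperparameters here are illustrative assumptions, not the paper's exact compressor, filter, or step-size rules.

```python
# Schematic sketch (not the authors' reference implementation) of
# Byzantine-robust distributed learning with compressed momentum.
import numpy as np

rng = np.random.default_rng(0)
dim, n_workers, n_byzantine, steps = 20, 10, 2, 200
beta, lr, k = 0.9, 0.05, 5          # momentum weight, step size, top-k budget
x_star = rng.normal(size=dim)       # synthetic optimum of a toy quadratic loss
x = np.zeros(dim)                   # model parameters
momenta = [np.zeros(dim) for _ in range(n_workers)]

def stochastic_grad(x):
    """Noisy gradient of the quadratic loss 0.5 * ||x - x_star||^2."""
    return (x - x_star) + 0.1 * rng.normal(size=dim)

def top_k(v, k):
    """Biased top-k sparsifier: keep only the k largest-magnitude coordinates."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def coordinate_median(msgs):
    """Simple robust aggregator: coordinate-wise median of worker messages."""
    return np.median(np.stack(msgs), axis=0)

for t in range(steps):
    msgs = []
    for i in range(n_workers):
        if i < n_byzantine:
            # A Byzantine worker sends an arbitrary adversarial vector.
            msgs.append(rng.normal(scale=10.0, size=dim))
            continue
        # Honest worker: Polyak momentum smooths stochastic-gradient noise,
        # then the momentum vector is compressed before communication.
        momenta[i] = beta * momenta[i] + (1 - beta) * stochastic_grad(x)
        msgs.append(top_k(momenta[i], k))
    x = x - lr * coordinate_median(msgs)  # server filters, then updates

print("distance to optimum:", np.linalg.norm(x - x_star))
```

Even with two Byzantine workers sending large random vectors, the robust aggregation of compressed momentum keeps the iterates converging toward the optimum on this toy problem, which is the qualitative behavior the summaries describe.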
Keywords
» Artificial intelligence » Classification » Image classification » Machine learning » Probability