Summary of A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks, by William Lindskog and Christian Prehofer
A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks
by William Lindskog, Christian Prehofer
First submitted to arXiv on: 3 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Federated Learning (FL) has emerged as a promising approach for training machine learning models on distributed datasets. FL was initially developed with Deep Neural Networks (DNNs) in mind and has shown success on image and text tasks, but its application to tabular data remains underexplored. Tree-Based Models (TBMs), which have been found to excel on tabular data, are now being integrated with FL. This study benchmarks federated TBMs and DNNs for horizontal FL, comparing their performance on 10 well-known tabular datasets with varying data partitions. The results indicate that current federated boosted TBMs outperform federated DNNs across different data partitions. Notably, a federated XGBoost model surpasses all other models tested. Moreover, the study shows that federated TBMs generally outperform parametric models, even when the number of clients is significantly increased. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Imagine a way to train artificial intelligence models using data from many different sources, without sharing the individual data itself. This concept is called Federated Learning (FL). Right now, FL works best with images and text, but what about tables of numbers? Researchers have been experimenting with Tree-Based Models (TBMs) on table data and found that they work really well with FL. In this study, scientists compared different types of models using FL on 10 well-known datasets to see which ones perform the best. The results show that a special type of TBM called XGBoost is the champion, followed closely by other TBMs. This means that using TBMs with FL can help create better AI models for working with table data. |
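To make the "horizontal FL" setting concrete: each client holds different rows of the same table, trains locally, and a server aggregates the local models. The sketch below is not the paper's benchmark (which uses federated boosted trees such as XGBoost); it is a minimal toy illustration of horizontal partitioning plus FedAvg-style weight averaging on a simple linear model, with all data, model, and hyperparameter choices invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular dataset: 300 rows, 5 features, linear target plus noise.
X = rng.normal(size=(300, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=300)

# Horizontal partition: each client holds a disjoint subset of ROWS
# (all clients share the same feature columns).
n_clients = 3
client_data = [(X[i::n_clients], y[i::n_clients]) for i in range(n_clients)]

def local_update(w, Xc, yc, lr=0.05, epochs=5):
    """One client's local training: a few gradient steps on squared error."""
    for _ in range(epochs):
        grad = 2 * Xc.T @ (Xc @ w - yc) / len(yc)
        w = w - lr * grad
    return w

# FedAvg-style aggregation: the server averages the locally trained
# weights each round, weighted by each client's dataset size.
w = np.zeros(5)
for _ in range(20):
    local_ws = [local_update(w, Xc, yc) for Xc, yc in client_data]
    sizes = [len(yc) for _, yc in client_data]
    w = np.average(local_ws, axis=0, weights=sizes)

mse = np.mean((X @ w - y) ** 2)
```

Raw rows never leave a client; only model weights are exchanged, which is the core privacy argument behind FL. Federated tree-based models replace the weight-averaging step with tree-specific aggregation (e.g. sharing gradient histograms or grown trees), which is what the benchmarked federated XGBoost variants do.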
Keywords
» Artificial intelligence » Federated learning » Machine learning » XGBoost