Summary of A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks, by William Lindskog and Christian Prehofer
A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks
by William Lindskog, Christian Prehofer
First submitted to arXiv on: 3 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Federated Learning (FL) has emerged as a promising approach for training machine learning models on distributed datasets. FL was initially developed with Deep Neural Networks (DNNs) in mind and has shown success on image and text tasks, but its application to tabular data remains underexplored. Tree-Based Models (TBMs), which have been found to excel on tabular data, are now being integrated with FL. This study benchmarks federated TBMs and DNNs for horizontal FL, comparing their performance on 10 well-known tabular datasets with varying data partitions. The results indicate that current federated boosted TBMs outperform federated DNNs across different data partitions. Notably, a federated XGBoost model surpasses all other models tested. Moreover, the study shows that federated TBMs generally outperform parametric models, even when the number of clients is significantly increased. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Imagine a way to train artificial intelligence models using data from many different sources, without sharing the individual data itself. This concept is called Federated Learning (FL). Right now, FL works best with images and text, but what about tables of numbers? Researchers have been experimenting with Tree-Based Models (TBMs) on table data and found that they work really well with FL. In this study, scientists compared different types of models using FL on 10 well-known datasets to see which ones perform the best. The results show that a special type of TBM called XGBoost is the champion, followed closely by other TBMs. This means that using TBMs with FL can help create better AI models for working with table data. |
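To make the "horizontal FL" setting concrete: each client holds different rows of the same table, trains locally, and a server aggregates the local models. The sketch below is not the paper's benchmark (which uses federated boosted trees such as XGBoost); it is a minimal toy illustration of horizontal partitioning plus FedAvg-style weight averaging on a simple linear model, with all data, model, and hyperparameter choices invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular dataset: 300 rows, 5 features, linear target plus noise.
X = rng.normal(size=(300, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=300)

# Horizontal partition: each client holds a disjoint subset of ROWS
# (all clients share the same feature columns).
n_clients = 3
client_data = [(X[i::n_clients], y[i::n_clients]) for i in range(n_clients)]

def local_update(w, Xc, yc, lr=0.05, epochs=5):
    """One client's local training: a few gradient steps on squared error."""
    for _ in range(epochs):
        grad = 2 * Xc.T @ (Xc @ w - yc) / len(yc)
        w = w - lr * grad
    return w

# FedAvg-style aggregation: the server averages the locally trained
# weights each round, weighted by each client's dataset size.
w = np.zeros(5)
for _ in range(20):
    local_ws = [local_update(w, Xc, yc) for Xc, yc in client_data]
    sizes = [len(yc) for _, yc in client_data]
    w = np.average(local_ws, axis=0, weights=sizes)

mse = np.mean((X @ w - y) ** 2)
```

Raw rows never leave a client; only model weights are exchanged, which is the core privacy argument behind FL. Federated tree-based models replace the weight-averaging step with tree-specific aggregation (e.g. sharing gradient histograms or grown trees), which is what the benchmarked federated XGBoost variants do.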
Keywords
» Artificial intelligence » Federated learning » Machine learning » XGBoost