

A Federated Learning Benchmark on Tabular Data: Comparing Tree-Based Models and Neural Networks

by William Lindskog, Christian Prehofer

First submitted to arXiv on: 3 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Federated Learning (FL) has emerged as a promising approach for training machine learning models on distributed datasets. Initially designed for Deep Neural Networks (DNNs), FL has shown success on image and text tasks. However, its application to tabular data remains underexplored. Tree-Based Models (TBMs) have been found to excel on tabular data and are now being integrated with FL. This study benchmarks federated TBMs and DNNs for horizontal FL, comparing their performance on 10 well-known tabular datasets with varying data partitions. The results indicate that current federated boosted TBMs outperform federated DNNs across different data partitions. Notably, a federated XGBoost model surpasses all other models tested. Moreover, the study shows that federated TBMs generally outperform parametric models, even when the number of clients is significantly increased.
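The benchmark above compares models under horizontal FL, where every client holds the same feature columns but different rows, and under varying data partitions (from balanced IID splits to skewed non-IID ones). As a rough illustration of what "varying data partitions" means in practice, here is a minimal sketch of partitioning a tabular dataset across clients; the function name and the label-sorted shard strategy are illustrative assumptions, not the paper's actual partitioning scheme.

```python
import random

def partition_horizontal(rows, labels, n_clients, iid=True, seed=0):
    """Split the rows of a tabular dataset across clients (horizontal FL:
    every client keeps the same feature columns but different rows).

    iid=True  -> uniform random shuffle, so each client gets a similar label mix.
    iid=False -> label-sorted shards, so each client sees only a few labels;
                 a simple stand-in for the skewed non-IID partitions a
                 benchmark like this one varies.
    """
    idx = list(range(len(labels)))
    if iid:
        random.Random(seed).shuffle(idx)
    else:
        idx.sort(key=lambda i: labels[i])
    shard = len(idx) // n_clients
    parts = [idx[c * shard:(c + 1) * shard] for c in range(n_clients)]
    return [([rows[i] for i in p], [labels[i] for i in p]) for p in parts]

# Toy tabular data: 1000 rows, 3 numeric features, 4 classes.
rows = [[i, i % 7, i % 11] for i in range(1000)]
labels = [i % 4 for i in range(1000)]

iid_parts = partition_horizontal(rows, labels, n_clients=4, iid=True)
skew_parts = partition_horizontal(rows, labels, n_clients=4, iid=False)

print([sorted(set(y)) for _, y in skew_parts])  # label-skewed: one class per client
print([len(set(y)) for _, y in iid_parts])      # IID: each client sees a mix of classes
```

Each client would then train its local model (e.g. an XGBoost learner or a DNN) on its own shard, and the degree of label skew is one of the knobs a benchmark can turn to stress-test federated aggregation.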
Low Difficulty Summary (written by GrooveSquid.com, original content)
Imagine a way to train artificial intelligence models using data from many different sources, without sharing the individual data itself. This concept is called Federated Learning (FL). Right now, FL works best with images and text, but what about tables of numbers? Researchers have been experimenting with Tree-Based Models (TBMs) on table data and found that they work really well when used with FL. In this study, scientists compared different types of models using FL on 10 big datasets to see which ones perform the best. The results show that a special type of TBM called XGBoost is the champion, followed closely by other TBMs. This means that using TBMs with FL can help create better AI models for working with table data.

Keywords

» Artificial intelligence  » Federated learning  » Machine learning  » XGBoost