
Summary of Cross-Table Pretraining towards a Universal Function Space for Heterogeneous Tabular Data, by Jintai Chen et al.


Cross-Table Pretraining towards a Universal Function Space for Heterogeneous Tabular Data

by Jintai Chen, Zhen Lin, Qiyuan Chen, Jimeng Sun

First submitted to arXiv on: 1 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty summary is the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper introduces XTFormer, a cross-table pretrained Transformer for versatile downstream tabular prediction tasks. The authors address the limitations of cross-dataset pretraining on heterogeneous tabular data by establishing a "meta-function" space that encompasses all potential feature-target mappings. During pretraining, a variety of such mappings are extracted from the pretraining tabular datasets and embedded into this space; for each downstream task, XTFormer is then adapted with a coordinate-positioning approach that locates the task's feature-target mapping within the space. Experiments over 190 downstream tabular prediction tasks show that XTFormer outperforms XGBoost and CatBoost on 137 (72%) tasks, FT-Transformer on 144 (76%), and the tabular pretraining method XTab on 162 (85%). This work demonstrates the effectiveness of cross-table pretraining of Transformers for tabular prediction.
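
To make the idea above more concrete, here is a minimal, hypothetical sketch of cross-table pretraining. It is not the authors' XTFormer code: the class name, dimensions, synthetic tables, and the learnable per-table "coordinate" embedding (standing in very loosely for the paper's coordinate-positioning step) are all illustrative assumptions.

```python
# Illustrative sketch only -- not the authors' XTFormer implementation.
# Assumptions: numeric features, a regression target, and a learnable per-table
# "coordinate" embedding standing in for positioning a task in the meta-function space.
import torch
import torch.nn as nn

class CrossTableTransformer(nn.Module):
    def __init__(self, max_features=32, num_tables=4, d_model=64):
        super().__init__()
        self.value_proj = nn.Linear(1, d_model)                 # shared scalar-to-token projection
        self.feature_id = nn.Embedding(max_features, d_model)   # which column a token came from
        self.table_coord = nn.Embedding(num_tables, d_model)    # one coordinate per pretraining table
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x, table_id):
        # x: (batch, n_features) numeric values; table_id: (batch,) dataset index
        batch, n_feat = x.shape
        ids = torch.arange(n_feat, device=x.device).expand(batch, n_feat)
        tokens = self.value_proj(x.unsqueeze(-1)) + self.feature_id(ids)
        coord = self.table_coord(table_id).unsqueeze(1)       # (batch, 1, d_model)
        h = self.encoder(torch.cat([coord, tokens], dim=1))   # prepend coordinate token
        return self.head(h[:, 0]).squeeze(-1)                 # predict from the coordinate slot

# Toy pretraining loop over synthetic tables with different numbers of columns.
model = CrossTableTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
tables = [(torch.randn(128, w), torch.randn(128)) for w in (5, 9, 13, 20)]
for epoch in range(3):
    for t_id, (x, y) in enumerate(tables):
        pred = model(x, torch.full((x.shape[0],), t_id, dtype=torch.long))
        loss = nn.functional.mse_loss(pred, y)
        opt.zero_grad(); loss.backward(); opt.step()
```

Under this sketch, adapting to a new table would amount to learning a fresh coordinate embedding (and possibly a new head) while reusing the shared backbone, which mirrors the coordinate-positioning idea only at a very coarse level.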
Low Difficulty Summary (written by GrooveSquid.com, original content)
This research is about a new way to use artificial intelligence (AI) models to predict patterns in tables of numbers. Right now, AI can learn from big datasets and then apply that knowledge to similar tasks, but when it comes to predicting patterns across different kinds of tables, the results are not very good. The authors of this study developed a new approach called XTFormer that can learn from many different kinds of tables and then use that knowledge to make better predictions on new, unseen data. In experiments, XTFormer was more accurate than each competing AI model on at least 72% of the tasks tested.

Keywords

» Artificial intelligence  » Pretraining  » Transformer  » XGBoost