Summary of Cross-Table Pretraining towards a Universal Function Space for Heterogeneous Tabular Data, by Jintai Chen et al.
Cross-Table Pretraining towards a Universal Function Space for Heterogeneous Tabular Data
by Jintai Chen, Zhen Lin, Qiyuan Chen, Jimeng Sun
First submitted to arXiv on: 1 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper and is written at a different level of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper introduces XTFormer, a cross-table pretrained Transformer for versatile downstream tabular prediction tasks. The authors address the limitations of cross-dataset pretraining on tabular data by establishing a “meta-function” space that encompasses all potential feature-target mappings. During pretraining, a variety of potential mappings are extracted from the pretraining tabular datasets and embedded into this space. XTFormer is then adapted to each downstream task by a “coordinate positioning” approach that locates the suitable mapping in this space (a minimal code sketch of this flow follows the table). Experimental results show that XTFormer outperforms both XGBoost and CatBoost on 137 (72%) of 190 downstream tasks, FT-Transformer on 144 (76%), and XTab on 162 (85%). This work demonstrates the effectiveness of pretraining Transformers for tabular prediction. |
| Low | GrooveSquid.com (original content) | This research is about a new way to use artificial intelligence (AI) models to predict patterns in tables of numbers. Right now, AI can learn from big datasets and then apply that knowledge to similar tasks, but when it comes to predicting patterns across different kinds of tables, the results are not very good. The authors developed a new approach called XTFormer that learns from many different kinds of tables and then uses that knowledge to make better predictions on new, unseen tables. In experiments, XTFormer was more accurate than other AI models on at least 72% of the tasks tested. |
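To make the pretrain/fine-tune flow in the medium summary concrete, here is a minimal, hypothetical PyTorch sketch. A shared Transformer backbone stands in for the “meta-function” space, each pretraining table gets a learnable coordinate vector that selects its feature-target mapping, and adapting to a new table (the “coordinate positioning” step) fits only a fresh coordinate while the backbone stays frozen. All names (`MetaFunctionSpace`, `coord_proj`, the toy data, and the hyperparameters) are illustrative assumptions, not the authors’ actual architecture or API.

```python
# Hypothetical sketch only: names, shapes, and training loops are illustrative
# assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn


class MetaFunctionSpace(nn.Module):
    """Shared Transformer backbone plus a per-task "coordinate" token.

    Intuition (from the summary above): every table's feature-target mapping
    is a point in one shared function space; a small task coordinate vector
    selects which mapping the shared backbone realizes.
    """

    def __init__(self, dim=64, n_heads=4, n_layers=2, coord_dim=16):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads,
                                           batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.coord_proj = nn.Linear(coord_dim, dim)  # coordinate -> task token
        self.head = nn.Linear(dim, 1)

    def forward(self, feature_tokens, task_coord):
        # feature_tokens: (batch, n_features, dim); task_coord: (coord_dim,)
        batch = feature_tokens.shape[0]
        cond = self.coord_proj(task_coord).expand(batch, 1, -1)
        hidden = self.backbone(torch.cat([cond, feature_tokens], dim=1))
        return self.head(hidden[:, 0])  # read the prediction off the task token


# Pretraining: embed many tables' mappings into the shared space by jointly
# training the backbone and one coordinate per pretraining table.
model = MetaFunctionSpace()
coords = {name: nn.Parameter(torch.randn(16)) for name in ("table_a", "table_b")}
opt = torch.optim.Adam(list(model.parameters()) + list(coords.values()), lr=1e-3)
for _ in range(100):
    for name, coord in coords.items():
        x = torch.randn(32, 8, 64)   # toy stand-in for this table's feature tokens
        y = torch.randn(32, 1)       # toy regression targets
        loss = nn.functional.mse_loss(model(x, coord), y)
        opt.zero_grad(); loss.backward(); opt.step()

# "Coordinate positioning" on a new, unseen table: freeze the backbone and
# fit only a fresh coordinate that locates the new feature-target mapping.
for p in model.parameters():
    p.requires_grad_(False)
new_coord = nn.Parameter(torch.randn(16))
ft_opt = torch.optim.Adam([new_coord], lr=1e-2)
for _ in range(50):
    x, y = torch.randn(32, 8, 64), torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(x, new_coord), y)
    ft_opt.zero_grad(); loss.backward(); ft_opt.step()
```

The point of fitting only the coordinate at adaptation time is that the shared backbone stays reusable across heterogeneous tables. The real XTFormer is considerably more elaborate; consult the paper for the actual architecture and training procedure.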
Keywords
» Artificial intelligence » Pretraining » Transformer » XGBoost