Summary of LaTable: Towards Large Tabular Models, by Boris van Breugel et al.
LaTable: Towards Large Tabular Models
by Boris van Breugel, Jonathan Crabbé, Rob Davis, Mihaela van der Schaar
First submitted to arXiv on: 25 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary The paper's original abstract, available on its arXiv listing. |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary LaTable is a novel tabular diffusion model proposed to address the challenges of building generative foundation models for tabular data. Such models are hard to develop for three reasons: different datasets have heterogeneous feature spaces, tables come with metadata, and tables lack prior knowledge. LaTable can be trained across different datasets and outperforms baselines on in-distribution generation, and finetuning it generates out-of-distribution datasets better with fewer samples. However, its zero-shot performance is poor; the paper examines this weakness and what it suggests for building models with better zero- and few-shot generation capabilities. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary LaTable is a new way to make tables that look like real ones. Right now, we don’t have many ways to do this because tables can be very different from each other. This makes it hard to create a model that can generate realistic tables. LaTable tries to fix this problem by being able to learn from many different types of tables. It works pretty well when it’s trained on the same type of data, but not so great when it has to make up new data without any examples. This is important because we need better ways to generate data that looks like real data. |
Keywords
» Artificial intelligence » Diffusion model » Few shot » Zero shot