
Summary of Tabular Data Contrastive Learning via Class-Conditioned and Feature-Correlation Based Augmentation

by Wei Cui, Rasa Hosseinzadeh, Junwei Ma, Tongzi Wu, Yi Sui, Keyvan Golestan

First submitted to arXiv on: 26 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty version is the paper's original abstract, which is not reproduced here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)

In this paper, the authors propose a new augmentation approach for contrastive learning on tabular data. Contrastive learning has succeeded on images and natural language largely thanks to domain-specific augmentation techniques. For tabular data, the existing method corrupts entries by swapping in values from other rows, which is not as effective. The authors introduce a simple yet powerful improvement: corrupting tabular data conditioned on class identity. Specifically, when corrupting an entry in an anchor row, they sample replacement values only from rows identified as belonging to the same class as the anchor row. They also explore selecting which features to corrupt based on feature correlation structures. In extensive experiments, the proposed approach consistently outperforms the conventional corruption method on tabular data classification tasks.

Low Difficulty Summary (written by GrooveSquid.com, original content)

Contrastive learning is a way to train models to recognize patterns in data by creating multiple slightly different versions of the same data. This has worked well for pictures and text, but not as well for tables of numbers. The authors think that changing (corrupting) a table entry using values from rows of the same class could be more helpful. They also try choosing which features to change based on how correlated they are with other features. In experiments, their method works better than the old way for classifying tabular data.
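
The class-conditioned corruption described in the medium difficulty summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `class_conditioned_corrupt`, the `corrupt_rate` parameter, and the uniform random choice of which features to corrupt are all assumptions made for illustration, and the class labels are assumed to be already available for every row.

```python
import random

def class_conditioned_corrupt(rows, labels, anchor_idx, corrupt_rate=0.3, rng=None):
    """Return a corrupted copy (a "view") of rows[anchor_idx].

    Hypothetical helper: each selected feature is replaced by the value of
    the same feature from a randomly drawn row in the anchor's class.
    """
    rng = rng or random.Random()
    view = list(rows[anchor_idx])  # copy, so the original table is untouched
    n_features = len(view)

    # Rows in the same class as the anchor. The anchor itself is included,
    # so the pool is never empty (assumption made for simplicity).
    pool = [i for i, y in enumerate(labels) if y == labels[anchor_idx]]

    # Assumption: features to corrupt are picked uniformly at random.
    n_corrupt = min(n_features, max(1, round(corrupt_rate * n_features)))
    for j in rng.sample(range(n_features), n_corrupt):
        donor = rng.choice(pool)
        view[j] = rows[donor][j]  # value comes from the same feature column
    return view

# Toy table: rows 0-1 belong to class 0, rows 2-3 to class 1.
rows = [[1.0, 10, "a"], [2.0, 20, "b"], [9.0, 90, "z"], [8.0, 80, "y"]]
labels = [0, 0, 1, 1]
view = class_conditioned_corrupt(rows, labels, anchor_idx=0, rng=random.Random(0))
```

In this sketch, every entry of `view` comes from row 0 or row 1, because only rows in the anchor's class can donate values. The conventional method would instead draw donors from the whole table. Choosing features by correlation structure, which the authors also explore, is not shown here because the summary does not describe how it works.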

Keywords

  • Artificial intelligence
  • Classification