Quantifying Spuriousness of Biased Datasets Using Partial Information Decomposition
by Barproda Halder, Faisal Hamman, Pasan Dissanayake, Qiuyi Zhang, Ilia Sucholutsky, Sanghamitra Dutta
First submitted to arXiv on: 29 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Information Theory (cs.IT)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here. |
Medium | GrooveSquid.com (original content) | In this paper, the researchers formalize the concept of spurious patterns in datasets using Partial Information Decomposition (PID). They quantify a dataset’s spuriousness through unique information, a measure rooted in Blackwell sufficiency. They demonstrate that higher unique information in spurious features can lead models to prefer those features over core features for inference, resulting in low worst-group accuracy. To compute unique information on high-dimensional image data, they also propose an autoencoder-based estimator and show its effectiveness (see the toy PID sketch after this table). |
Low | GrooveSquid.com (original content) | Spurious patterns are misleading connections between variables in a dataset that aren’t really related. This paper gives these patterns a precise mathematical definition. The authors create a new way to measure how much of this pattern is present in the data, called unique information. They show that when there is more of this spurious signal, models may pick the wrong features for prediction, which hurts accuracy on rare groups. To estimate this measure on real data, they use autoencoders to compress the features first (see the estimator sketch after this table). |
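To make the PID terminology above concrete, here is a minimal, self-contained toy sketch in Python. Note the hedge: it uses the simpler Williams–Beer I_min redundancy as a stand-in for the Blackwell-sufficiency-based (BROJA-style) unique information the paper actually uses, and the joint distribution over a label Y, a spurious feature S, and a core feature C is invented for illustration.

```python
# Toy PID sketch: unique information of a spurious feature S and a core
# feature C about a label Y. Uses Williams-Beer I_min redundancy for
# simplicity; the paper's measure is instead rooted in Blackwell sufficiency
# and requires solving a convex program. The joint p(y, s, c) is made up.
import numpy as np

p = np.zeros((2, 2, 2))   # indexed as p[y, s, c], all variables binary
p[0, 0, 0] = 0.40         # Y=0 mostly co-occurs with S=0
p[0, 0, 1] = 0.05
p[0, 1, 0] = 0.04
p[0, 1, 1] = 0.01
p[1, 1, 1] = 0.40         # Y=1 mostly co-occurs with S=1
p[1, 1, 0] = 0.05
p[1, 0, 1] = 0.04
p[1, 0, 0] = 0.01

def mi(joint):
    """Mutual information I(A; B) in bits for a 2-D joint distribution."""
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pb)[nz])).sum())

def specific_info(p_yz, y):
    """Specific information I(Y=y; Z) = sum_z p(z|y) log2 p(y|z)/p(y)."""
    py, pz, out = p_yz.sum(axis=1), p_yz.sum(axis=0), 0.0
    for z in range(p_yz.shape[1]):
        if p_yz[y, z] > 0:
            out += (p_yz[y, z] / py[y]) * np.log2((p_yz[y, z] / pz[z]) / py[y])
    return out

p_ys = p.sum(axis=2)      # marginal joint of (Y, S)
p_yc = p.sum(axis=1)      # marginal joint of (Y, C)
py = p.sum(axis=(1, 2))

# Williams-Beer redundancy: expected min specific information over sources.
red = sum(py[y] * min(specific_info(p_ys, y), specific_info(p_yc, y))
          for y in range(2))
uni_s = mi(p_ys) - red    # unique information in the spurious feature
uni_c = mi(p_yc) - red    # unique information in the core feature
print(f"Uni(S) = {uni_s:.3f} bits, Uni(C) = {uni_c:.3f} bits, Red = {red:.3f} bits")
```

In this toy distribution S is slightly more predictive of Y than C, so Uni(S) comes out larger: exactly the regime the paper flags, where a model may latch onto the spurious feature.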
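The autoencoder-based estimation idea can be sketched as well. Everything below is an illustrative assumption rather than the paper’s exact architecture: the tiny encoder/decoder, the 8-dimensional latents, k-means with 10 clusters as the discretizer, and the synthetic data standing in for image features.

```python
# Hedged sketch of an autoencoder-based PID estimation pipeline: compress
# high-dimensional features to low-dimensional latents, discretize them,
# then build the discrete joint over (label, spurious latent, core latent)
# that a discrete PID estimator (e.g., the sketch above) would consume.
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class TinyAutoencoder(nn.Module):
    """Deliberately small encoder/decoder pair, for illustration only."""
    def __init__(self, in_dim, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, in_dim))
    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def train_autoencoder(model, x, epochs=50, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):          # full-batch reconstruction training
        opt.zero_grad()
        recon, _ = model(x)
        loss_fn(recon, x).backward()
        opt.step()
    return model

def discretize(latents, n_clusters=10):
    """Map continuous latents to cluster ids so PID can be estimated on a
    discrete joint distribution (choice of k is a tuning decision)."""
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(latents)

# Synthetic placeholders standing in for spurious / core image features.
x_spurious = torch.randn(512, 100)
x_core = torch.randn(512, 100)
y = np.random.randint(0, 2, size=512)

ae_s = train_autoencoder(TinyAutoencoder(100), x_spurious)
ae_c = train_autoencoder(TinyAutoencoder(100), x_core)
with torch.no_grad():
    s_disc = discretize(ae_s.encoder(x_spurious).numpy())
    c_disc = discretize(ae_c.encoder(x_core).numpy())

# Empirical joint p(y, s, c) to feed into a discrete PID estimator.
joint = np.zeros((2, 10, 10))
for yi, si, ci in zip(y, s_disc, c_disc):
    joint[yi, si, ci] += 1
joint /= joint.sum()
```

The design point the compression step addresses is that PID estimators operate on discrete joint distributions, which are intractable to estimate directly over raw pixels; shrinking and clustering the features first makes the joint small enough to estimate from data.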
Keywords
» Artificial intelligence » Autoencoder » Inference