Summary of Optimal Transport For Fairness: Archival Data Repair Using Small Research Data Sets, by Abigail Langbridge and Anthony Quinn and Robert Shorten
First submitted to arXiv on: 20 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computers and Society (cs.CY); Statistics Theory (math.ST)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The authors propose an algorithm to repair unfairness in training data, specifically addressing the need for archival data repair. They define fairness in terms of conditional independence between protected attributes and features, given unprotected attributes. The approach uses optimal transport (OT)-based repair plans on interpolated supports, allowing off-sample, labelled archival data to be repaired subject to stationarity assumptions. Experimental results demonstrate effective repair of large quantities of off-sample, labelled data using simulated and real-world datasets such as Adult. This work is particularly relevant in light of the AI Act and other regulations emphasizing fairness in machine learning. |
| Low | GrooveSquid.com (original content) | The paper aims to fix unfairness in training data using a method called optimal transport (OT). It's like fixing mistakes in a big library where some books are unfairly labelled. The authors define what fairness means mathematically and then create a way to repair the mistakes using only a small part of the data that is already labelled. This makes it faster and cheaper to fix many more books (data) without collecting new ones. The results show that this method works well for repairing large amounts of data, including real-world datasets like Adult. |
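To make the idea of OT-based repair concrete, here is a minimal, hypothetical sketch (not the authors' implementation) of one common one-dimensional instance: each protected group's feature distribution is mapped onto the Wasserstein barycenter of the groups via quantile averaging, so that after repair the feature no longer depends on the protected attribute. The group data and all names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: one feature whose distribution differs
# across two protected groups (this is simulated, not the Adult data).
group_a = rng.normal(loc=40.0, scale=5.0, size=500)
group_b = rng.normal(loc=60.0, scale=8.0, size=500)

def repair_to_barycenter(x_a, x_b):
    """Map each group's samples onto the 1-D Wasserstein barycenter
    of the two group distributions, via quantile averaging."""
    # Empirical CDF value (rank) of each sample within its own group.
    u_a = (np.argsort(np.argsort(x_a)) + 0.5) / len(x_a)
    u_b = (np.argsort(np.argsort(x_b)) + 0.5) / len(x_b)
    # Barycenter quantile function: the average of the two groups'
    # quantile functions on a common grid.
    grid = np.linspace(0.0, 1.0, 1001)
    q_bar = 0.5 * (np.quantile(x_a, grid) + np.quantile(x_b, grid))
    # Transport each sample to the barycenter value at its rank.
    rep_a = np.interp(u_a, grid, q_bar)
    rep_b = np.interp(u_b, grid, q_bar)
    return rep_a, rep_b

rep_a, rep_b = repair_to_barycenter(group_a, group_b)

# After repair the two groups share (approximately) one distribution,
# removing the dependence between the feature and the protected attribute.
print(abs(group_a.mean() - group_b.mean()))  # large gap before repair
print(abs(rep_a.mean() - rep_b.mean()))      # near zero after repair
```

The same quantile-averaging map can be evaluated on new, off-sample points, which loosely mirrors the paper's point that a repair plan fitted on a small research set can then be applied to large archival data, provided the distributions are stationary.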
Keywords
* Artificial intelligence
* Machine learning