Summary of Data vs. Model Machine Learning Fairness Testing: An Empirical Study, by Arumoy Shome, Luis Cruz, and Arie van Deursen
Data vs. Model Machine Learning Fairness Testing: An Empirical Study
by Arumoy Shome, Luis Cruz, Arie van Deursen
First submitted to arXiv on: 15 Jan 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computers and Society (cs.CY); Software Engineering (cs.SE)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper takes a crucial step forward by evaluating the fairness of Machine Learning (ML) models not just after training, but also before training, on the data itself. The researchers test their approach using four ML algorithms, five real-world datasets, and 1,600 fairness evaluation cycles. They find a linear relationship between data and model fairness metrics when the distribution or size of the training data changes. This means that detecting biases early, during data collection, can be an efficient way to prevent biased models from being trained (a code sketch of this idea follows the table). |
| Low | GrooveSquid.com (original content) | Machine learning is like teaching a computer new skills! But sometimes, these computers can be unfair or biased. In this paper, scientists are trying to figure out how to make sure the computers don't learn bad habits. They're looking at two ways to measure fairness: one for when the data is collected and another for after the training is done. They used different types of computer algorithms, real-life datasets, and many tests (1,600!) to see if they could spot problems early on. What they found was that it's possible to catch biases in the data collection process before even starting to train the computer! This can help reduce development time and costs. |
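To make the relationship described in the medium summary concrete, here is a minimal sketch; it is not the paper's code. It assumes a synthetic binary-classification dataset with one protected group attribute, uses scikit-learn's LogisticRegression as a stand-in for the paper's four algorithms, and uses simple group-difference measures (positive-label-rate difference in the training data, demographic parity difference in predictions) as stand-ins for the paper's fairness metrics. It varies the training-set size and checks whether the data-level and model-level metrics move together.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

def data_fairness(y, group):
    """Data-level metric: difference in positive-label rate between groups (illustrative choice)."""
    return abs(y[group == 1].mean() - y[group == 0].mean())

def model_fairness(model, X, group):
    """Model-level metric: demographic parity difference of predictions (illustrative choice)."""
    pred = model.predict(X)
    return abs(pred[group == 1].mean() - pred[group == 0].mean())

# Synthetic data: a protected attribute that correlates with the label (hypothetical setup).
n = 5000
group = rng.integers(0, 2, n)
X = rng.normal(size=(n, 5)) + group[:, None] * 0.5
y = (X[:, 0] + 0.8 * group + rng.normal(scale=1.0, size=n) > 1.0).astype(int)

data_scores, model_scores = [], []
for frac in np.linspace(0.2, 1.0, 20):          # vary the training-set size
    idx = rng.choice(n, size=int(frac * n), replace=False)
    Xtr, ytr, gtr = X[idx], y[idx], group[idx]
    clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    data_scores.append(data_fairness(ytr, gtr))
    model_scores.append(model_fairness(clf, X, group))

# A strong positive correlation here mirrors the linear relationship the paper reports.
r, p = pearsonr(data_scores, model_scores)
print(f"Pearson correlation between data and model fairness: r={r:.2f} (p={p:.3f})")
```

In this sketch, a high correlation would suggest that a bias measurable in the collected data already predicts bias in the trained model, which is the rationale for testing fairness before training.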
Keywords
* Artificial intelligence
* Machine learning