

Outlier Detection Bias Busted: Understanding Sources of Algorithmic Bias through Data-centric Factors

by Xueying Ding, Rui Xi, Leman Akoglu

First submitted to arXiv on: 24 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computers and Society (cs.CY)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The proposed work sheds light on possible sources of unfairness in unsupervised outlier detection (OD) by auditing detection models under different data-centric factors. The study injects various known biases into input data, including sample size disparity, under-representation, feature measurement noise, and group membership obfuscation (a toy sketch of this kind of audit follows after these summaries). It finds that all of the audited OD algorithms exhibit fairness pitfalls, though they differ in which types of data bias they are most susceptible to. Notably, the study demonstrates that OD algorithm bias is not merely a data bias problem: it can also be driven by natural or organic data properties such as sparsity, base rate, variance, and multi-modality. The findings have implications for developing fairness-enhanced OD algorithms in applications like finance and security.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This research looks at how machine learning (ML) models can be unfair when they’re used in real-world situations. Right now, most studies of fairness focus on supervised ML, but unsupervised outlier detection (OD) is important too. This work tries to figure out why OD algorithms are unfair by looking at different kinds of data bias. It finds that all of the OD algorithms it studied have fairness problems, and that each is affected differently by different types of bias. The study also shows that the problem isn’t just bias injected into the data: natural, “organic” properties of the data, such as how spread out or clustered it is, matter too. This is important because we need ML models to be fair when they’re used in high-stakes areas like finance and security.

Keywords

» Artificial intelligence  » Machine learning  » Outlier detection  » Supervised  » Unsupervised