Summary of On the Role of Speech Data in Reducing Toxicity Detection Bias, by Samuel J. Bell et al.
On the Role of Speech Data in Reducing Toxicity Detection Bias
by Samuel J. Bell, Mariano Coria Meglioli, Megan Richards, Eduardo Sánchez, Christophe Ropers, Skyler Wang, Adina Williams, Levent Sagun, Marta R. Costa-jussà
First submitted to arXiv on: 12 Nov 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | The paper studies bias in text-based toxicity detection systems, which produce high false-positive rates on content that mentions demographic groups. The researchers investigate whether speech-based systems can mitigate these biases by comparing speech- and text-based toxicity classifiers on the multilingual MuTox dataset. They find that access to speech data during inference reduces bias against group mentions, particularly for ambiguous and disagreement-inducing samples. The study also suggests that improving the classifiers themselves is more effective at reducing group bias than improving the transcription pipelines.
Low | GrooveSquid.com (original content) | The paper looks at how well systems can detect toxic language in text and speech. Right now, these systems are biased, which means they incorrectly flag certain groups of people more often than others. The researchers wanted to see whether using audio recordings, instead of just written text, helps reduce this bias. They tested several different types of classifiers on a dataset called MuTox and found that when the system gets to listen to the recording, it is less likely to make false accusations against certain groups. This is especially true for tricky cases where people might disagree or have mixed feelings. Overall, the study suggests that making the classifier better matters more than improving how spoken words are transcribed.
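The bias the summaries describe is, roughly, the gap in false-positive rates between classifiers on non-toxic samples that mention a demographic group. As a toy illustration only (this is not the paper's code, and the data below is made up), one could compare two classifiers like this:

```python
# Hypothetical sketch: comparing false-positive rates of a text-based
# vs. a speech-based toxicity classifier on non-toxic samples that
# mention a demographic group. All names and data here are illustrative.

def false_positive_rate(predictions, labels):
    """Fraction of truly non-toxic samples (label 0) flagged as toxic (prediction 1)."""
    negatives = [p for p, y in zip(predictions, labels) if y == 0]
    if not negatives:
        return 0.0
    return sum(negatives) / len(negatives)

# Toy data: every sample is actually non-toxic but mentions a group.
labels = [0, 0, 0, 0]
text_preds = [1, 1, 1, 0]    # text classifier over-flags group mentions
speech_preds = [1, 0, 0, 0]  # speech classifier flags fewer of them

bias_gap = (false_positive_rate(text_preds, labels)
            - false_positive_rate(speech_preds, labels))
print(bias_gap)  # 0.5: in this toy example, speech access cuts false positives
```

A positive gap here would mean the text classifier wrongly flags group-mentioning content more often, which is the kind of bias the paper reports speech access reducing.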
Keywords
* Artificial intelligence
* Inference