Summary of Towards Measuring Fairness in Speech Recognition: Fair-Speech Dataset, by Irina-Elena Veliche et al.


Towards measuring fairness in speech recognition: Fair-Speech dataset

by Irina-Elena Veliche, Zhuangqun Huang, Vineeth Ayyat Kochaniyan, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer

First submitted to arXiv on: 22 Aug 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Computers and Society (cs.CY); Sound (cs.SD); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary
Written by: Paper authors
See the paper's original abstract.

Medium Difficulty Summary
Written by: GrooveSquid.com (original content)
This paper addresses a gap in current public datasets for automatic speech recognition (ASR), which often overlook fairness, such as performance disparities across demographic groups. To bridge this gap, the researchers introduce Fair-Speech, a publicly released corpus designed to evaluate ASR accuracy across demographic attributes: age, gender, ethnicity, geographic variation, and native English speaker status. The dataset comprises approximately 26.5K utterances recorded by 593 individuals in the United States, who were compensated for recording and submitting audio clips of themselves saying voice commands. The paper also provides ASR baselines, both from models trained on transcribed and untranscribed social media videos and from open-source models.

Low Difficulty Summary
Written by: GrooveSquid.com (original content)
This paper creates a new dataset for checking how well machines understand different people's voices. Right now, many public datasets for speech recognition don't consider fairness, meaning how well machines perform for different groups of people. The researchers want to change this by introducing the Fair-Speech dataset, which has over 26,000 recordings from 593 people in the United States. The speakers differ in age, gender, ethnicity, geographic location, and native English speaker status. The goal is to measure how well machines recognize voices across these groups.
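
To make the idea of measuring performance across demographic groups more concrete, here is a minimal sketch in Python. It assumes word error rate (WER) as the accuracy metric, which is a common choice for ASR but is not named in the summaries above. The record fields (`reference`, `hypothesis`, `age`), the age labels, and the function names are hypothetical and are not taken from the Fair-Speech release.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(records, attribute):
    """Pool word errors and reference words per group, then divide."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for rec in records:
        ref = rec["reference"].lower().split()
        hyp = rec["hypothesis"].lower().split()
        group = rec[attribute]
        errors[group] += edit_distance(ref, hyp)
        words[group] += len(ref)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Toy data: hypothetical voice commands and hypothetical ASR outputs.
records = [
    {"age": "18-22", "reference": "play some music", "hypothesis": "play some music"},
    {"age": "18-22", "reference": "call my mom", "hypothesis": "call mom"},
    {"age": "66+", "reference": "set a timer", "hypothesis": "set the timer"},
    {"age": "66+", "reference": "turn off the lights", "hypothesis": "turn of lights"},
]

result = wer_by_group(records, "age")
for group, wer in sorted(result.items()):
    print(f"{group}: WER = {wer:.3f}")
```

A gap between the per-group values is the kind of disparity a fairness evaluation would flag. With only a few utterances per group, such a gap could easily be noise, so a real analysis would also need enough data per group and some measure of statistical uncertainty.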

Keywords

* Artificial intelligence