Summary of An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition, by Yi-Cheng Wang et al.

An Effective Context-Balanced Adaptation Approach for Long-Tailed Speech Recognition

by Yi-Cheng Wang, Li-Ting Pai, Bi-Cheng Yan, Hsin-Wei Wang, Chi-Han Lin, Berlin Chen

First submitted to arXiv on: 10 Sep 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	Read the original abstract in the paper on arXiv.
Medium	GrooveSquid.com (original content)	The paper proposes an extension to contextual adapters (CAs) for end-to-end (E2E) automatic speech recognition (ASR) models, aiming to improve recognition of rare words. CAs infuse external knowledge into E2E ASR models through a context word list. However, two data imbalance problems remain: overfitting caused by using low-frequency words as context words, and poor performance on the low-frequency context words themselves. The authors investigate how altering the frequency distribution of the context list affects model performance and introduce a simple yet effective context-balanced learning objective. Experiments on the AISHELL-1 benchmark dataset show a reduction in character error rate (CER) of up to 1.21% and a more pronounced 9.44% reduction in the error rate of zero-shot words.
Low	GrooveSquid.com (original content)	The paper is about how computers can better understand spoken language, especially when rare words are involved. Current methods are good at recognizing common words but struggle with uncommon ones. The authors aim to improve this by giving their method extra information (a list of words to watch for) and by training it in a way that treats common and rare words more evenly.
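
For readers unfamiliar with the metric cited in the medium summary, CER is conventionally the character-level edit distance between the recognized text and the reference transcript, divided by the number of reference characters. The following is a minimal sketch of that conventional definition, not the paper's evaluation code; the function names `edit_distance` and `cer` are illustrative assumptions.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference character missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra character in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(refs: Sequence[str], hyps: Sequence[str]) -> float:
    """Corpus-level CER: total character edits / total reference characters."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("reference transcripts are empty")
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_edits / total_chars
```

As a usage example, `cer(["今天天气很好"], ["今天天汽很好"])` counts one substituted character against six reference characters. Whether the 1.21% and 9.44% figures in the summary are absolute or relative reductions is not stated there, so this sketch makes no assumption about that.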

Keywords

» Artificial intelligence » CER » Overfitting » Zero-shot
