Summary of Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing, by Hukai Huang et al.

Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing

by Hukai Huang, Shenghui Lu, Yahui Shan, He Qu, Fengrun Zhang, Wenhao Guan, Qingyang Hong, Lin Li

First submitted to arXiv on: 26 Jul 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	The paper's original abstract, available on its arXiv page.
Medium	GrooveSquid.com (original content)	The proposed Dynamic Language Group-based Mixture of Experts (DLG-MoE) model applies the parameter-scaling advantages of MoE to code-switching automatic speech recognition (CS-ASR). It routes hierarchically: a language router explicitly models language attributes and dispatches each representation to the corresponding language expert group, and an unsupervised router within each group implicitly models attributes beyond language, coordinating how the experts in that group are selected and combined. DLG-MoE outperforms existing MoE methods on CS-ASR tasks and is flexible: it supports inference with different top-k values, supports streaming recognition, and can be pruned to obtain a monolingual sub-model.
Low	GrooveSquid.com (original content)	A Mixture of Experts (MoE) model is made up of several specialized sub-networks, called experts, and this paper uses one for speech recognition. Its version, called Dynamic Language Group-based MoE (DLG-MoE), first works out which language is being spoken and sends the audio to a group of experts for that language; the experts within the group then work together. The new model does a better job of recognizing speech when people switch between languages mid-conversation. It can also be trimmed down to a smaller model that handles only one language.
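
The two-level routing described in the medium summary can be illustrated with a small toy. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `HierarchicalMoE`, its methods, the linear experts, and the random weights are all hypothetical stand-ins for trained components, and a plain argmax over language logits is assumed for the first routing level. It shows a language-level dispatch, a top-k choice inside the selected group, and pruning to a single language group.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def rand_matrix(rng, rows, cols):
    """Random weights; a stand-in for parameters that would be trained."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


class HierarchicalMoE:
    """Toy two-level router (hypothetical; not the paper's code)."""

    def __init__(self, dim, languages, experts_per_group, seed=0):
        rng = random.Random(seed)
        self.languages = list(languages)
        # Level 1: one row of language-router weights per language.
        self.lang_router = rand_matrix(rng, len(self.languages), dim)
        # Level 2: one router and one set of linear experts per language group.
        self.group_routers = {
            lang: rand_matrix(rng, experts_per_group, dim)
            for lang in self.languages
        }
        self.experts = {
            lang: [rand_matrix(rng, dim, dim) for _ in range(experts_per_group)]
            for lang in self.languages
        }

    def route(self, x, top_k=1):
        """Route one frame x; return (language, expert ids, weights, output)."""
        # Level 1: pick the language group with the highest logit (assumed argmax).
        lang_logits = matvec(self.lang_router, x)
        best = max(range(len(lang_logits)), key=lang_logits.__getitem__)
        lang = self.languages[best]
        # Level 2: pick the top-k experts inside that group only.
        gate_logits = matvec(self.group_routers[lang], x)
        order = sorted(range(len(gate_logits)),
                       key=gate_logits.__getitem__, reverse=True)
        chosen = order[:top_k]
        # Renormalize the gate over the chosen experts and mix their outputs.
        weights = softmax([gate_logits[i] for i in chosen])
        out = [0.0] * len(x)
        for w, i in zip(weights, chosen):
            y = matvec(self.experts[lang][i], x)
            out = [o + w * v for o, v in zip(out, y)]
        return lang, chosen, weights, out

    def prune_to(self, lang):
        """Keep a single language group, giving a monolingual sub-model."""
        idx = self.languages.index(lang)
        sub = HierarchicalMoE.__new__(HierarchicalMoE)  # skip random init
        sub.languages = [lang]
        sub.lang_router = [self.lang_router[idx]]
        sub.group_routers = {lang: self.group_routers[lang]}
        sub.experts = {lang: self.experts[lang]}
        return sub


moe = HierarchicalMoE(dim=4, languages=["zh", "en"], experts_per_group=3)
frame = [0.5, -1.0, 0.25, 2.0]
lang, chosen, weights, out = moe.route(frame, top_k=2)
english_only = moe.prune_to("en")
```

Because `top_k` is an argument to `route` rather than part of the stored weights, the same toy model can be run with a different number of active experts at inference time, and `prune_to` simply discards the other groups' routers and experts.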

Keywords

» Artificial intelligence » Inference » Mixture of experts » Pruning » Unsupervised
