Summary of Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing, by Hukai Huang et al.
Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing
by Hukai Huang, Shenghui Lu, Yahui Shan, He Qu, Fengrun Zhang, Wenhao Guan, Qingyang Hong, Lin Li
First submitted to arXiv on: 26 Jul 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper’s original abstract, available on its arXiv page.
Medium | GrooveSquid.com (original content) | The proposed Dynamic Language Group-based Mixture of Experts (DLG-MoE) leverages parameter scaling to tackle code-switching automatic speech recognition (CS-ASR). The model uses a hierarchical routing mechanism: a router that explicitly models language attributes dispatches representations to the corresponding language expert groups, and an unsupervised router within each group implicitly models attributes beyond language, coordinating expert routing and collaboration. DLG-MoE outperforms existing MoE methods on CS-ASR tasks while remaining flexible: it supports different top-k settings at inference, streaming recognition, and parameter pruning to obtain a monolingual sub-model (a minimal sketch of the routing scheme follows this table).
Low | GrooveSquid.com (original content) | The Mixture of Experts (MoE) model is used for speech recognition. This paper improves MoE with something called Dynamic Language Group-based MoE, or DLG-MoE. It acts like a routing filter that helps the model tell languages apart and lets its experts work together. The new model does a better job of recognizing speech when people switch between languages. It can also be trimmed down for specific tasks, such as understanding only one language.
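
The sketch below illustrates the hierarchical routing idea described in the medium summary; it is not the authors' code, just a minimal example assuming a PyTorch implementation. Names such as `DLGMoELayer`, `num_groups`, `experts_per_group`, and `top_k` are placeholders chosen for this illustration, not identifiers from the paper.

```python
# Minimal, illustrative sketch of hierarchical MoE routing (assumed PyTorch layout).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DLGMoELayer(nn.Module):
    def __init__(self, d_model=256, num_groups=2, experts_per_group=4, top_k=2):
        super().__init__()
        self.num_groups = num_groups              # one expert group per language (assumption)
        self.experts_per_group = experts_per_group
        self.top_k = top_k
        # Explicit language router: scores which language group each frame belongs to.
        self.language_router = nn.Linear(d_model, num_groups)
        # Unsupervised routers: one per group, scoring the experts inside that group.
        self.group_routers = nn.ModuleList(
            nn.Linear(d_model, experts_per_group) for _ in range(num_groups)
        )
        # Feed-forward experts, organized into equally sized language groups.
        self.experts = nn.ModuleList(
            nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                              nn.Linear(4 * d_model, d_model))
                for _ in range(experts_per_group)
            )
            for _ in range(num_groups)
        )

    def forward(self, x):
        # x: (batch, time, d_model) frame representations
        lang_logits = self.language_router(x)             # (B, T, num_groups)
        group_ids = lang_logits.argmax(dim=-1)            # hard dispatch to a language group

        out = torch.zeros_like(x)
        for g in range(self.num_groups):
            mask = group_ids == g                         # frames routed to group g
            if not mask.any():
                continue
            tokens = x[mask]                              # (N, d_model)
            # Unsupervised top-k routing inside the group.
            scores = F.softmax(self.group_routers[g](tokens), dim=-1)
            topk_scores, topk_idx = scores.topk(self.top_k, dim=-1)
            topk_scores = topk_scores / topk_scores.sum(dim=-1, keepdim=True)
            mixed = torch.zeros_like(tokens)
            for slot in range(self.top_k):
                for e in range(self.experts_per_group):
                    sel = topk_idx[:, slot] == e
                    if sel.any():
                        mixed[sel] += topk_scores[sel, slot:slot + 1] * self.experts[g][e](tokens[sel])
            out[mask] = mixed
        # lang_logits could take an explicit language supervision loss (assumption).
        return out, lang_logits
```

A full implementation would presumably also cover the aspects the summary mentions but this sketch omits: explicit supervision of the language router, streaming-friendly routing, varying top-k at inference time, and pruning away one group's experts to obtain a monolingual sub-model.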
Keywords
» Artificial intelligence » Inference » Mixture of experts » Pruning » Unsupervised