
Summary of Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing, by Hukai Huang et al.


Dynamic Language Group-Based MoE: Enhancing Code-Switching Speech Recognition with Hierarchical Routing

by Hukai Huang, Shenghui Lu, Yahui Shan, He Qu, Fengrun Zhang, Wenhao Guan, Qingyang Hong, Lin Li

First submitted to arXiv on: 26 Jul 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty: High
Written by: Paper authors
Summary: Read the original abstract here

Summary difficulty: Medium
Written by: GrooveSquid.com (original content)
Summary: The proposed Dynamic Language Group-based Mixture of Experts (DLG-MoE) model targets code-switching automatic speech recognition (CS-ASR) while leveraging the advantages of parameter scaling. It operates on a hierarchical routing mechanism: a language router first explicitly models the language attribute and dispatches representations to the corresponding language expert group. An unsupervised router within each group then implicitly models attributes beyond language and coordinates expert routing and collaboration. DLG-MoE outperforms existing MoE methods on CS-ASR tasks and is flexible: it supports inference with different top-k values, supports streaming, and can be pruned to obtain a monolingual sub-model.

Summary difficulty: Low
Written by: GrooveSquid.com (original content)
Summary: Mixture of Experts (MoE) models are used for speech recognition. This paper improves on them with a model called Dynamic Language Group-based MoE, or DLG-MoE. Its routing works like a special filter that first identifies the language being spoken and then lets the experts for that language work together. The new model does a better job of recognizing speech when people switch between languages. It can also be adapted, for example trimmed down to a model that only understands one language.
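
To make the two-level routing in the medium-difficulty summary more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the class name `HierarchicalMoE`, the language tags "zh" and "en", the random toy weights, and the use of small linear maps as experts are all assumptions made for illustration. It shows a language router picking an expert group, a router inside that group picking the top-k experts, and a pruning step that keeps a single language group.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def make_linear(rows, cols, rng):
    """Random rows x cols matrix standing in for trained weights (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


class HierarchicalMoE:
    """Toy two-level router: language group first, then experts in that group."""

    def __init__(self, dim, languages, experts_per_group, seed=0):
        rng = random.Random(seed)
        self.languages = list(languages)
        # Level 1: one score per language.
        self.lang_router = make_linear(len(self.languages), dim, rng)
        # Level 2: one router and one set of experts per language group.
        self.group_routers = {
            lang: make_linear(experts_per_group, dim, rng) for lang in self.languages
        }
        self.experts = {
            lang: [make_linear(dim, dim, rng) for _ in range(experts_per_group)]
            for lang in self.languages
        }

    def route(self, x, top_k=1):
        """Return the chosen language and renormalized weights of its top-k experts."""
        lang_probs = softmax([dot(w, x) for w in self.lang_router])
        best = max(range(len(lang_probs)), key=lang_probs.__getitem__)
        lang = self.languages[best]
        gate = softmax([dot(w, x) for w in self.group_routers[lang]])
        chosen = sorted(range(len(gate)), key=gate.__getitem__, reverse=True)[:top_k]
        norm = sum(gate[i] for i in chosen)
        return lang, {i: gate[i] / norm for i in chosen}

    def forward(self, x, top_k=1):
        """Weighted sum of the selected experts' outputs for one input vector."""
        lang, weights = self.route(x, top_k)
        out = [0.0] * len(x)
        for idx, weight in weights.items():
            y = [dot(row, x) for row in self.experts[lang][idx]]
            out = [o + weight * yi for o, yi in zip(out, y)]
        return lang, out

    def prune_to(self, lang):
        """Keep only one language group, giving a monolingual sub-model."""
        sub = HierarchicalMoE.__new__(HierarchicalMoE)
        sub.languages = [lang]
        sub.lang_router = [self.lang_router[self.languages.index(lang)]]
        sub.group_routers = {lang: self.group_routers[lang]}
        sub.experts = {lang: self.experts[lang]}
        return sub


# Hypothetical usage: a 4-dimensional frame routed in a two-language model.
model = HierarchicalMoE(dim=4, languages=["zh", "en"], experts_per_group=3)
frame = [0.5, -0.2, 0.1, 0.9]
language, expert_weights = model.route(frame, top_k=2)
english_only = model.prune_to("en")
```

Because `top_k` is an argument to `route` rather than part of the model, the same weights can be run with different top-k values, and `prune_to` simply drops the other groups' routers and experts. Both choices mirror the flexibility the summary describes, but how the paper actually trains or prunes its routers is not covered by this sketch.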

Keywords

» Artificial intelligence  » Inference  » Mixture of experts  » Pruning  » Unsupervised