

Contrastive and Consistency Learning for Neural Noisy-Channel Model in Spoken Language Understanding

by Suyoung Kim, Jiyeon Hwang, Ho-Young Jung

First submitted to arXiv on: 23 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The recent surge in deep end-to-end learning for intent classification in Spoken Language Understanding (SLU) has led to highly optimized models. However, these models require large amounts of speech data with intent labels and are sensitive to inconsistencies between training and evaluation conditions. To address these limitations, we propose a natural language understanding approach based on Automatic Speech Recognition (ASR). This module-based approach utilizes pre-trained general language models and adapts to mismatches in the speech input environment. We improve a noisy-channel model to handle transcription inconsistencies caused by ASR errors, using a two-stage method called Contrastive and Consistency Learning (CCL). CCL correlates error patterns between clean and noisy ASR transcripts and emphasizes the consistency of their latent features. Experimental results on four benchmark datasets show that CCL outperforms existing methods and improves ASR robustness in various noisy environments.
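The two ingredients of CCL named above can be sketched roughly as follows. This is a generic NumPy illustration, not the paper's exact formulation: the InfoNCE-style contrastive loss, the mean-squared-error consistency term, the `temperature`, and the `alpha`/`beta` weighting are all assumptions for the sake of the example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-9):
    # Unit-normalize rows so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def contrastive_loss(clean, noisy, temperature=0.1):
    """InfoNCE-style loss (an assumption, not the paper's exact loss):
    each clean-transcript embedding should be most similar to its own
    noisy ASR counterpart among all noisy embeddings in the batch."""
    c = l2_normalize(clean)
    n = l2_normalize(noisy)
    logits = c @ n.T / temperature               # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # positives on the diagonal

def consistency_loss(clean, noisy):
    """Penalize divergence between latent features of a clean transcript
    and its ASR-corrupted version (MSE is an illustrative choice)."""
    return np.mean((clean - noisy) ** 2)

def ccl_objective(clean, noisy, alpha=1.0, beta=1.0):
    # Hypothetical weighted combination of the two terms.
    return alpha * contrastive_loss(clean, noisy) + beta * consistency_loss(clean, noisy)

# Toy batch: 4 latent vectors of dim 8; "noisy" = clean + small perturbation,
# standing in for the encoder output of an ASR-corrupted transcript.
rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 8))
noisy = clean + 0.05 * rng.normal(size=(4, 8))
print(ccl_objective(clean, noisy))
```

In this sketch, the contrastive term pulls each clean/noisy transcript pair together while pushing apart mismatched pairs in the batch, and the consistency term directly aligns the latent features of the two views.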
Low Difficulty Summary (written by GrooveSquid.com, original content)
Recently, researchers have been studying how to use deep learning to understand what people mean when they talk (SLU). However, this requires a lot of data labeled with what the person meant, and these models are very sensitive to differences between the conditions they were trained in and the conditions they are used in. To fix this, we came up with an idea using Automatic Speech Recognition (ASR), which can use pre-trained language models and adjust to different environments. We also made a new method called CCL that helps the model deal with errors caused by ASR. We tested it on four datasets and showed that our approach works better than others.

Keywords

» Artificial intelligence  » Classification  » Deep learning  » Language understanding