Summary of CALICO: Conversational Agent Localization via Synthetic Data Generation, by Andy Rosenbaum et al.


CALICO: Conversational Agent Localization via Synthetic Data Generation

by Andy Rosenbaum, Pegah Kharazmi, Ershad Banijamali, Lu Zeng, Christopher DiPersio, Pan Wei, Gokmen Oz, Clement Chung, Karolina Owczarzak, Fabian Triefenbach, Wael Hamza

First submitted to arXiv on: 6 Dec 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary
Written by: Paper authors
Read the original abstract here

Medium Difficulty Summary
Written by: GrooveSquid.com (original content)
This paper introduces CALICO, a method for fine-tuning Large Language Models (LLMs) to localize conversational agent training data from one language to another. For slot values, the approach supports three operations: verbatim copy, literal translation, and localization, in which values better suited to the target language are generated, such as city and airport names. To improve performance, CALICO employs an iterative filtering mechanism to discard noisy generated samples. To demonstrate CALICO's effectiveness, the authors build a new human-localized (HL) version of the MultiATIS++ travel information test set in 8 languages. CALICO is reported to outperform the state-of-the-art LINGUIST method on both the original human-translated (HT) version of the test set and the new HL version.

Low Difficulty Summary
Written by: GrooveSquid.com (original content)
CALICO is a way to help conversational computer assistants work in more languages. It teaches a language model to convert training conversations from one language into another. The model can copy words as they are, translate them literally, or generate new words that fit the target language better, such as local city names. To make sure the results are good, CALICO gets rid of bad samples. This helps the assistant trained on the data perform better. The team tested this approach by making a new test set in 8 languages and showed it works better than an earlier method called LINGUIST.
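
To make the three slot operations named in the medium difficulty summary more concrete, here is a minimal toy sketch. It is not the authors' implementation: CALICO performs these operations with a fine-tuned LLM, whereas this sketch uses small lookup tables. The function name `transform_slot`, the operation labels, and every table entry are hypothetical and chosen only for illustration.

```python
# Toy illustration only. CALICO uses a fine-tuned LLM, not lookup tables.
# All names and table entries below are hypothetical assumptions.

# (target language, source value) -> literal translation
LITERAL_TRANSLATIONS = {
    ("de", "morning"): "Morgen",
}

# (target language, slot type) -> a value that suits the target locale
LOCALIZED_VALUES = {
    ("de", "city_name"): "Berlin",
}


def transform_slot(slot_type, value, operation, target_lang):
    """Apply one of the three slot operations named in the summary."""
    if operation == "copy":
        # Verbatim copy: keep the source value unchanged.
        return value
    if operation == "translate":
        # Literal translation: fall back to the source value if unknown.
        return LITERAL_TRANSLATIONS.get((target_lang, value), value)
    if operation == "localize":
        # Localization: replace the value with one suited to the target locale.
        return LOCALIZED_VALUES.get((target_lang, slot_type), value)
    raise ValueError(f"unknown operation: {operation}")


if __name__ == "__main__":
    examples = [
        ("airline_name", "Delta", "copy"),
        ("period_of_day", "morning", "translate"),
        ("city_name", "Boston", "localize"),
    ]
    for slot_type, value, operation in examples:
        result = transform_slot(slot_type, value, operation, "de")
        print(f"{operation}: {value} -> {result}")
```

In this sketch the operation is passed in by the caller. How the operation is chosen for each slot in CALICO itself is not covered by the summaries above.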

Keywords

» Artificial intelligence  » Fine tuning  » Translation