Summary of Generating Bilingual Example Sentences with Large Language Models As Lexicography Assistants, by Raphael Merx et al.
Generating bilingual example sentences with large language models as lexicography assistants
by Raphael Merx, Ekaterina Vylomova, Kemal Kurniawan
First submitted to arXiv on 4 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | The paper's original abstract, available on the arXiv page
Medium | GrooveSquid.com (original content) | This paper explores the performance of Large Language Models (LLMs) in generating example sentences for bilingual dictionaries across three languages: French (high-resource), Indonesian (mid-resource), and Tetun (low-resource). The authors evaluate the quality of LLM-generated examples against the GDEX criteria, which assess typicality, informativeness, and intelligibility. They find that while LLMs can generate good dictionary examples, their performance degrades for lower-resourced languages. They also observe high variability in human preferences for example quality and demonstrate that in-context learning can align LLMs with individual annotator preferences. Finally, the authors explore the use of pre-trained language models for automated rating of examples, finding that sentence perplexity is a good proxy for typicality and intelligibility in higher-resourced languages.
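The perplexity proxy mentioned in the summary can be sketched as follows. Given per-token log-probabilities from some pretrained language model (the scoring model itself is an assumption here; the summary does not name one), sentence perplexity is the exponential of the negative mean token log-probability, and lower values suggest the model finds the sentence more typical and intelligible:

```python
import math

def sentence_perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities:
    exp(-mean(log p)). Lower values mean the language model finds
    the sentence more predictable, which the paper's summary uses
    as a proxy for typicality and intelligibility."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Hypothetical log-probabilities for a typical-sounding sentence
# (higher per-token probabilities) vs. an odd one (lower ones).
typical = [math.log(0.5), math.log(0.4), math.log(0.6)]
odd = [math.log(0.05), math.log(0.02), math.log(0.1)]

print(sentence_perplexity(typical) < sentence_perplexity(odd))  # True
```

In practice the log-probabilities would come from scoring the candidate example sentence with a pretrained LM; the numbers above are illustrative only.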
Low | GrooveSquid.com (original content) | This research looks at how well Large Language Models can help create example sentences for dictionaries that translate words from one language to another. The study tested the models on three languages: French, Indonesian, and Tetun (which has limited resources). The models can do a good job, but they struggle with languages that don't have much data available. The researchers also discovered that people have different opinions about what makes a good example sentence, which makes it hard to create consistent ratings. To address this, they showed that giving a model a few of an individual rater's past judgments in its prompt (in-context learning) helps it agree with that rater. The study contributes a new dataset of 600 ratings for LLM-generated sentences and shows how these models can help reduce the cost of creating dictionaries, especially for languages with limited resources.
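The in-context alignment idea can be illustrated with a prompt that shows the model a few of one annotator's past ratings before asking it to rate a new candidate sentence. The prompt wording, the 1-5 scale, and the helper name below are illustrative assumptions, not the paper's actual template:

```python
def build_rating_prompt(headword, past_ratings, candidate):
    """Assemble a few-shot prompt from one annotator's previous
    judgments so the model mimics that annotator's preferences.
    `past_ratings` is a list of (sentence, score) pairs; the 1-5
    scale and the wording are illustrative assumptions."""
    lines = [
        f"Rate example sentences for the dictionary entry '{headword}' "
        "on a 1-5 scale, following this annotator's style."
    ]
    for sentence, score in past_ratings:
        lines.append(f"Sentence: {sentence}\nRating: {score}")
    # Leave the final rating blank for the model to complete.
    lines.append(f"Sentence: {candidate}\nRating:")
    return "\n\n".join(lines)

prompt = build_rating_prompt(
    "marché",  # hypothetical French headword
    [("Le marché est ouvert le samedi.", 4),
     ("Marché.", 1)],
    "Elle va au marché chaque matin.",
)
print(prompt)
```

The completed prompt would then be sent to an LLM, whose next-token completion serves as the personalized rating.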
Keywords
» Artificial intelligence » Perplexity