Summary of XNLIeu: a dataset for cross-lingual NLI in Basque, by Maite Heredia et al.
XNLIeu: a dataset for cross-lingual NLI in Basque
by Maite Heredia, Julen Etxaniz, Muitze Zulaika, Xabier Saralegi, Jeremy Barnes, Aitor Soroa
First submitted to arXiv on: 10 Apr 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract, available from the arXiv listing. |
Medium | GrooveSquid.com (original content) | The XNLIeu dataset extends the XNLI cross-lingual Natural Language Inference (NLI) benchmark to Basque, a low-resource language that can benefit from transfer-learning approaches. The dataset was created by machine-translating the English XNLI corpus into Basque and then manually post-editing the output. To evaluate it, the authors ran experiments with mono- and multilingual Large Language Models (LLMs). The results show that professional post-editing of the machine translation is necessary and that the translate-train cross-lingual strategy (fine-tuning on training data translated into Basque) generally outperforms other approaches. However, its gain was lower when tested on a dataset written natively in Basque. |
Low | GrooveSquid.com (original content) | This research paper creates a new dataset for testing how well machine learning models understand Basque, a lesser-known language. Basque is a good example of a language with few resources that machine learning can help. The researchers machine-translated an existing English test of language understanding into Basque and then reviewed it by hand to make sure it was accurate. They tested different ways to use this new dataset with machine learning models and found that one approach worked best overall. However, when they tested the same models on a dataset originally written in Basque, that approach's advantage was smaller. The researchers made their code and datasets available for others to use. |
Keywords
» Artificial intelligence » Inference » Machine learning » Transfer learning » Translation