Summary of Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?, by Omid Ghahroodi et al.

Khayyam Challenge (PersianMMLU): Is Your LLM Truly Wise to The Persian Language?

by Omid Ghahroodi, Marzia Nouri, Mohammad Vali Sanian, Alireza Sahebi, Doratossadat Dastgheib, Ehsaneddin Asgari, Mahdieh Soleymani Baghshah, Mohammad Hossein Rohban

First submitted to arXiv on: 9 Apr 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	The paper’s original abstract.
Medium	GrooveSquid.com (original content)	The Khayyam Challenge (also called PersianMMLU) is a newly introduced benchmark for evaluating Large Language Models (LLMs) that support the Persian language. It comprises 20,192 four-choice questions drawn from 38 diverse tasks extracted from Persian examinations, covering a range of subjects, difficulty levels, and age groups. The benchmark aims to assess different facets of LLMs, such as language comprehension, reasoning, and information retrieval, across educational stages from lower primary school to upper secondary school. Its distinctive characteristics include broad topic coverage, rich metadata, new data chosen to avoid contamination issues, and original, non-translated data tailored for Persian speakers. As a result, it avoids translation challenges and errors while capturing cultural nuances. The benchmark is also scalable: it can be updated and reused for future evaluations without requiring special human effort. The paper evaluates a wide range of existing LLMs that support the Persian language and provides statistical analyses and interpretations of their outputs.
Low	GrooveSquid.com (original content)	The Khayyam Challenge is a new way to test how well computers can understand the Persian language. It’s like a big quiz with 20,192 questions! These questions come from different tests and exams in Persian schools, covering many subjects like math, science, and literature. The goal is to see how good computer models are at understanding Persian text and answering questions. This challenge is special because it includes lots of extra information, like how hard each question is and what the correct answers are. The questions were written in Persian from the start, so there aren’t any translation problems. This makes it easier to compare different computer models and see which ones do best. The researchers tested many existing computer models that can understand Persian text and looked at their results. They want to make sure these models are as good as they can be!
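
To make the idea of scoring a four-choice benchmark with metadata more concrete, here is a minimal Python sketch. It is not the authors' evaluation code: the `Question` class, its field names (`subject`, `stage`), the `accuracy_by` helper, and the toy questions and predictions are all hypothetical and invented purely for illustration. It only shows how accuracy might be broken down by a metadata field such as subject or educational stage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Question:
    """One four-choice item. Every field name here is hypothetical."""
    text: str
    choices: tuple   # the four answer options
    answer: int      # index (0-3) of the correct option
    subject: str     # illustrative metadata field
    stage: str       # illustrative metadata field


def accuracy_by(questions, predictions, key):
    """Return {group: fraction answered correctly}, grouped by a metadata field."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for question, predicted in zip(questions, predictions):
        group = getattr(question, key)
        total[group] += 1
        correct[group] += int(predicted == question.answer)
    return {group: correct[group] / total[group] for group in total}


# Toy data, invented for illustration only (not taken from the benchmark).
questions = [
    Question("2 + 2 = ?", ("3", "4", "5", "6"), 1, "math", "lower primary"),
    Question("3 * 3 = ?", ("6", "8", "9", "12"), 2, "math", "upper primary"),
    Question("Opposite of 'hot'?", ("cold", "warm", "dry", "wet"), 0,
             "language", "lower primary"),
]
predictions = [1, 0, 0]  # a model's chosen option indices, also invented

print(accuracy_by(questions, predictions, "subject"))
print(accuracy_by(questions, predictions, "stage"))
```

Grouping by a metadata field rather than reporting one overall number is what lets a benchmark like this show where a model is weak, for example on a particular subject or school stage.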

Keywords

* Artificial intelligence
* Translation
