Summary of INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering Capability of LLMs for Indic Languages, by Abhishek Kumar Singh et al.
INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering Capability of LLMs for Indic Languages
by Abhishek Kumar Singh, Vishwajeet Kumar, Rudra Murthy, Jaydeep Sen, Ashish Mittal, Ganesh Ramakrishnan
First submitted to arXiv on: 18 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract on the paper’s arXiv page. |
| Medium | GrooveSquid.com (original content) | The paper introduces the Indic QA Benchmark, a large dataset for context-grounded question answering in 11 major Indian languages, covering both extractive and abstractive tasks. Multilingual Large Language Models (LLMs), including instruction-finetuned variants, are evaluated on it and show weak performance in low-resource languages, attributed to an English bias in their training data. The paper also explores the Translate Test paradigm, in which inputs are translated into English for processing and the output is translated back into the source language (see the sketch after this table); this approach outperforms direct multilingual inference, especially in low-resource settings. |
| Low | GrooveSquid.com (original content) | The Indic QA Benchmark is a new dataset that can help us understand how well Large Language Models (LLMs) answer questions in languages other than English. Right now, we don’t know much about how these models perform in languages like Hindi or Bengali because there are few tests to measure them. The researchers created a big test with lots of questions and answers in 11 Indian languages. They also evaluated instruction-finetuned versions of these models. Unfortunately, the models didn’t do very well on low-resource languages like Nepali or Punjabi because they are biased towards English. The researchers found that a different approach worked better: translating the question into English, answering it there, and translating the answer back. This could help us build language models that answer questions well in many different languages. |
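To make the Translate Test idea concrete, here is a minimal sketch of the pipeline. All names below (`translate`, `answer_in_english`, `translate_test_qa`) are hypothetical stand-ins for illustration; the paper does not specify an implementation, a particular machine-translation system, or an LLM.

```python
# A minimal sketch of the Translate Test pipeline, assuming hypothetical
# `translate` and `answer_in_english` helpers. The paper does not prescribe
# these names or any particular MT system / LLM.

def translate(text: str, source: str, target: str) -> str:
    """Stand-in for any machine-translation system."""
    return f"[{source}->{target}] {text}"  # dummy behaviour for illustration

def answer_in_english(question: str, context: str) -> str:
    """Stand-in for an English-centric LLM doing context-grounded QA."""
    return "answer derived from context"  # dummy behaviour for illustration

def translate_test_qa(question: str, context: str, lang: str) -> str:
    """Answer a question asked in `lang` by routing through English."""
    q_en = translate(question, source=lang, target="en")  # inputs -> English
    c_en = translate(context, source=lang, target="en")
    a_en = answer_in_english(q_en, c_en)                  # answer in English
    return translate(a_en, source="en", target=lang)      # answer -> source language

if __name__ == "__main__":
    print(translate_test_qa("प्रश्न?", "संदर्भ पाठ", lang="hi"))
```

The design point is that the LLM only ever sees English text, so the pipeline sidesteps the English bias the summaries describe, at the cost of depending on translation quality in both directions.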
Keywords
- Artificial intelligence
- Question answering