Summary of Multilingual Coreference Resolution in Low-resource South Asian Languages, by Ritwik Mishra et al.

Multilingual Coreference Resolution in Low-resource South Asian Languages

by Ritwik Mishra, Pooja Desur, Rajiv Ratn Shah, Ponnurangam Kumaraguru

First submitted to arXiv on: 21 Feb 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary The proposed TransMuCoRes dataset addresses the scarcity of publicly accessible resources and models for coreference resolution in South Asian languages. The dataset is produced by translation with off-the-shelf tools; nearly all predicted translations pass a sanity check, and 75% of the English references align with their predicted translations. Two coreference resolution models built on multilingual encoders are trained on TransMuCoRes combined with a manually annotated Hindi dataset, reaching 64 LEA F1 and 68 CoNLL F1 on the Hindi golden set. The work evaluates an end-to-end coreference resolution model on Hindi and highlights limitations of current evaluation metrics when applied to datasets with split antecedents.
Low	GrooveSquid.com (original content)	Low Difficulty Summary This paper tackles coreference resolution: working out which words in a text refer to the same real-world thing. Most work has been done in English, and there is little support for other languages such as those spoken in South Asia. The researchers created a large dataset that anyone can use to work on this problem. They translated the data into many languages using machine translation, which worked well in most cases. They then trained computer models to perform coreference resolution and tested them on Hindi text. The authors report that this is the first evaluation of this kind on Hindi. The results show that better ways of measuring how good these models are remain necessary.
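
For readers unfamiliar with the scores quoted in the summaries, the following is a minimal sketch of how a CoNLL F1 figure is usually assembled: it is the unweighted average of the F1 scores from three coreference metrics (MUC, B-cubed, and CEAF-e), each of which is the harmonic mean of a precision and a recall. The function names and the input numbers are hypothetical and chosen only to show the arithmetic; they are not taken from the paper.

```python
from statistics import mean


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def conll_f1(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """Unweighted average of the MUC, B-cubed, and CEAF-e F1 scores."""
    return mean([muc_f1, b_cubed_f1, ceaf_e_f1])


# Hypothetical per-metric F1 scores (illustrative only, not the paper's values).
print(conll_f1(80.0, 70.0, 60.0))  # 70.0
```

Because the headline number is an average, two systems with the same CoNLL F1 can behave quite differently on the individual metrics, which is one reason to look at more than a single score.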

Keywords

» Artificial intelligence » Coreference
