Summary of Multilingual Coreference Resolution in Low-resource South Asian Languages, by Ritwik Mishra et al.


Multilingual Coreference Resolution in Low-resource South Asian Languages

by Ritwik Mishra, Pooja Desur, Rajiv Ratn Shah, Ponnurangam Kumaraguru

First submitted to arXiv on: 21 Feb 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
The proposed TransMuCoRes dataset addresses the scarcity of publicly accessible resources and models for coreference resolution in South Asian languages. The dataset is built by translating existing data with off-the-shelf tools; nearly all of the predicted translations pass a sanity check, and 75% align with their English references. Two multilingual encoders are trained on TransMuCoRes together with an annotated Hindi dataset, achieving scores of 64 for LEA F1 and 68 for CoNLL F1 on the Hindi golden set. The work evaluates an end-to-end coreference resolution model on Hindi and highlights limitations of current evaluation metrics when they are applied to datasets with split antecedents.
Low GrooveSquid.com (original content) Low Difficulty Summary
This paper tackles a problem called coreference resolution: figuring out which words and phrases in a text refer to the same real-world thing. Most work has been done in English, and there is little help for other languages, such as those spoken in South Asia. The researchers created a large dataset that anyone can use to work on this problem. They built it by machine-translating English data into South Asian languages, which worked well in most cases. Then they trained computer models to perform coreference resolution and tested them on Hindi text. It is the first time an end-to-end model of this kind has been evaluated on Hindi. The results show that we still need better ways to measure how good these models are.
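For readers curious about the scores mentioned in the medium summary: CoNLL F1 is conventionally the unweighted mean of three coreference metrics (MUC, B-cubed, and CEAF-e F1). The sketch below is illustrative only and is not the paper's evaluation code. It computes B-cubed, one of the three components, on a toy example, under the simplifying assumption that gold and predicted clusters cover the same set of mentions. The function names `b_cubed` and `conll_f1` and the toy mention labels are hypothetical.

```python
def b_cubed(gold_clusters, pred_clusters):
    """Return (precision, recall, f1) for the B-cubed coreference metric.

    Simplifying assumption: only mentions that appear in both the gold
    and the predicted clustering are scored.
    """
    # Map every mention to the cluster that contains it.
    gold_of = {m: frozenset(c) for c in gold_clusters for m in c}
    pred_of = {m: frozenset(c) for c in pred_clusters for m in c}
    mentions = gold_of.keys() & pred_of.keys()
    if not mentions:
        return 0.0, 0.0, 0.0

    # Per-mention precision: share of the predicted cluster that is correct.
    precision = sum(
        len(gold_of[m] & pred_of[m]) / len(pred_of[m]) for m in mentions
    ) / len(mentions)
    # Per-mention recall: share of the gold cluster that was recovered.
    recall = sum(
        len(gold_of[m] & pred_of[m]) / len(gold_of[m]) for m in mentions
    ) / len(mentions)

    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def conll_f1(muc_f1, b_cubed_f1, ceaf_e_f1):
    """CoNLL F1: unweighted mean of the MUC, B-cubed, and CEAF-e F1 scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3


# Toy example: mentions are labelled a..e (hypothetical data).
gold = [{"a", "b", "c"}, {"d", "e"}]
pred = [{"a", "b"}, {"c", "d", "e"}]
p, r, f = b_cubed(gold, pred)
print(f"B-cubed precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In practice, mentions are text spans rather than single letters, and the MUC and CEAF-e scores would come from an established scorer rather than being written by hand.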

Keywords

» Artificial intelligence  » Coreference