Summary of Multilingual Coreference Resolution in Low-resource South Asian Languages, by Ritwik Mishra et al.
Multilingual Coreference Resolution in Low-resource South Asian Languages
by Ritwik Mishra, Pooja Desur, Rajiv Ratn Shah, Ponnurangam Kumaraguru
First submitted to arXiv on: 21 Feb 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary The paper's original abstract, available on the paper's arXiv page |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The proposed TransMuCoRes dataset addresses the scarcity of publicly accessible resources and models for coreference resolution in South Asian languages. The dataset is built by translating English coreference data with off-the-shelf tools. Nearly all of the predicted translations pass a sanity check, and 75% align with their English references. Two coreference resolution models built on multilingual encoders are trained on TransMuCoRes together with a manually annotated Hindi dataset, and the best one scores 64 LEA F1 and 68 CoNLL F1 on the Hindi golden set. The work evaluates an end-to-end coreference resolution model on Hindi and highlights the limitations of current evaluation metrics when they are applied to datasets with split antecedents. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper helps solve a problem: figuring out when different words in a text refer to the same real-world thing. Most of this work has been done in English, and there is not much help for other languages like those spoken in South Asia. The researchers created a big dataset that anyone can use to work on this problem. They used machine translation to turn the data into many languages, which worked pretty well. Then they trained computer models to do coreference resolution and tested them on Hindi text. It is the first time this kind of model has been tested on Hindi in this way. The results show that we still need better ways to measure how good these computer models are. |
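
For readers curious about the scores mentioned in the medium summary: CoNLL F1 is conventionally the unweighted mean of three coreference F1 scores (MUC, B-cubed, and CEAF-e). The sketch below is a minimal illustration, not the paper's evaluation code. It computes B-cubed for a toy example and shows the averaging step. The function names, the toy mentions, and the three input scores passed to `conll_f1` are all hypothetical, and the B-cubed function assumes the gold and predicted clusters cover exactly the same mentions.

```python
from typing import Hashable, List, Set, Tuple

Cluster = Set[Hashable]


def b_cubed(gold: List[Cluster], pred: List[Cluster]) -> Tuple[float, float, float]:
    """Return B-cubed (precision, recall, F1).

    Assumption: gold and pred partition the same non-empty set of mentions.
    """
    # Map every mention to the cluster that contains it.
    gold_of = {m: c for c in gold for m in c}
    pred_of = {m: c for c in pred for m in c}
    mentions = list(gold_of)
    # Number of mentions shared by a mention's gold and predicted clusters.
    overlap = {m: len(gold_of[m] & pred_of[m]) for m in mentions}
    precision = sum(overlap[m] / len(pred_of[m]) for m in mentions) / len(mentions)
    recall = sum(overlap[m] / len(gold_of[m]) for m in mentions) / len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def conll_f1(muc_f1: float, b3_f1: float, ceafe_f1: float) -> float:
    """CoNLL F1: unweighted mean of the MUC, B-cubed, and CEAF-e F1 scores."""
    return (muc_f1 + b3_f1 + ceafe_f1) / 3


# Toy example (hypothetical mentions): two gold entities, imperfect prediction.
gold_clusters = [{"Sita", "she", "her"}, {"Ram", "he"}]
pred_clusters = [{"Sita", "she"}, {"her", "Ram", "he"}]
p, r, f = b_cubed(gold_clusters, pred_clusters)
print(f"B-cubed precision={p:.3f} recall={r:.3f} F1={f:.3f}")

# Made-up component scores, only to show the averaging step.
print(f"CoNLL F1 = {conll_f1(0.6, 0.7, 0.8):.3f}")
```

MUC and CEAF-e are omitted here because they need link counting and an optimal cluster alignment, respectively. In practice an established scorer would supply all three components.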
Keywords
» Artificial intelligence » Coreference