Summary of “Do ‘English’ Named Entity Recognizers Work Well on Global Englishes?”, by Alexander Shan et al.
Do “English” Named Entity Recognizers Work Well on Global Englishes?
by Alexander Shan, John Bauer, Riley Carlson, Christopher Manning
First submitted to arXiv on: 20 Apr 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper examines the limitations of popular English named entity recognition (NER) datasets for analyzing global varieties of English. It introduces a newswire dataset, the Worldwide English NER Dataset, to assess how widely used NER toolkits and transformer models perform on low-resource English varieties from around the world. The results show that models trained on commonly used British or American English datasets suffer significant performance drops when tested on the global dataset, with the largest declines observed for Oceania and Africa. However, a combined model trained on the global dataset together with either CoNLL or OntoNotes maintains strong performance on both test sets. |
Low | GrooveSquid.com (original content) | The paper is about how well computer models can recognize names of people, places, and organizations in different types of English. Right now, most language models are only good at understanding American or British English, even though English is spoken in many other ways around the world. The researchers created a new dataset of news articles from all over the world to test how well these models handle global English varieties. They found that most models don’t do very well on this new dataset, especially for English varieties from Oceania and Africa. However, they were able to build a combined model that works well on both American and global English texts. |
Keywords
» Artificial intelligence » Named entity recognition (NER) » Transformer