Summary of "Evaluating Named Entity Recognition: A Comparative Analysis of Mono- and Multilingual Transformer Models on a Novel Brazilian Corporate Earnings Call Transcripts Dataset", by Ramon Abilio et al.
Evaluating Named Entity Recognition: A comparative analysis of mono- and multilingual transformer models on a novel Brazilian corporate earnings call transcripts dataset
by Ramon Abilio, Guilherme Palermo Coelho, Ana Estela Antunes da Silva
First submitted to arXiv on: 18 Mar 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract. |
| Medium | GrooveSquid.com (original content) | This study investigates the applicability of pre-trained transformer-based models to Brazilian Portuguese, focusing on fine-tuning for financial Named Entity Recognition (NER). The researchers evaluate four models: two monolingual Brazilian Portuguese models (BERTimbau and PTT5) and two multilingual models (mBERT and mT5). Performance is measured on the newly created BraFiNER dataset, which comprises earnings call transcripts from Brazilian banks annotated with a weakly supervised approach. The study also introduces a novel approach that reframes the token classification task as text generation. After fine-tuning, the models are compared using performance and error metrics. Results show that BERT-based models consistently outperform T5-based models, with BERTimbau achieving higher macro F1-scores than PTT5; error metrics also favor BERTimbau. The study further highlights the importance of accuracy and consistency in financial applications: PTT5 and mT5 occasionally generated sentences in which monetary and percentage values were altered. |
| Low | GrooveSquid.com (original content) | A new study looks at using special language models to help machines understand Brazilian Portuguese. The researchers want to see if these models can be used for a specific task called Named Entity Recognition (NER) on financial documents from Brazil. They test four different models: two that were trained just for Brazilian Portuguese and two that can handle multiple languages. To evaluate the models, they use a special dataset they created with sentences from earnings calls by Brazilian banks. The study also introduces a new way to approach NER tasks. After fine-tuning the models, they compare their performance using different measures. The results show that some models are better than others at this task, and one model in particular stands out as very good. |
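To make the "reframing token classification as text generation" idea concrete, here is a minimal, hypothetical sketch of how BIO-tagged NER data could be converted into a generation target that a seq2seq model (such as PTT5 or mT5) learns to emit. The bracket-annotation format, the `bio_to_generation_target` helper, and the example sentence are illustrative assumptions, not the paper's actual scheme:

```python
# Hypothetical sketch: turning a BIO-tagged sentence into a text-generation
# target, so a seq2seq model can be trained to output the annotated sentence.
# The bracket format "[TYPE ...]" is an assumption for illustration only.

def bio_to_generation_target(tokens, bio_tags):
    """Convert tokens with BIO tags into a bracket-annotated target string."""
    out, span, span_type = [], [], None

    def flush():
        # Close any open entity span and append it as "[TYPE token ...]".
        nonlocal span, span_type
        if span:
            out.append(f"[{span_type} {' '.join(span)}]")
            span, span_type = [], None

    for tok, tag in zip(tokens, bio_tags):
        if tag.startswith("B-"):        # entity begins: close previous span
            flush()
            span, span_type = [tok], tag[2:]
        elif tag.startswith("I-") and span:  # entity continues
            span.append(tok)
        else:                           # outside any entity
            flush()
            out.append(tok)
    flush()
    return " ".join(out)

# Example (Portuguese sentence with a monetary entity, for illustration):
tokens = ["Lucro", "líquido", "de", "R$", "1,2", "bilhão", "no", "trimestre"]
tags = ["O", "O", "O", "B-MONEY", "I-MONEY", "I-MONEY", "O", "O"]
print(bio_to_generation_target(tokens, tags))
# → Lucro líquido de [MONEY R$ 1,2 bilhão] no trimestre
```

A source sentence and its bracket-annotated counterpart would then form an input/target pair for seq2seq fine-tuning. This framing also makes the failure mode the summary mentions easy to see: a generative model can emit a fluent sentence in which a monetary or percentage value differs from the input, an error a pure token classifier cannot make.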
Keywords
* Artificial intelligence * BERT * Classification * Fine-tuning * Named entity recognition * NER * Supervised * T5 * Text generation * Token * Transformer