Summary of Development of Pre-Trained Transformer-based Models for the Nepali Language, by Prajwal Thapa et al.
Development of Pre-Trained Transformer-based Models for the Nepali Language
by Prajwal Thapa, Jinu Nyachhyon, Mridul Sharma, Bal Krishna Bal
First submitted to arXiv on: 24 Nov 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract, available on the arXiv listing. |
Medium | GrooveSquid.com (original content) | The paper addresses the underrepresentation of the Nepali language in natural language processing (NLP) by pre-training three transformer-based models (BERT, RoBERTa, and GPT-2) exclusively on Nepali text. The authors collected a 27.5 GB Nepali text dataset, approximately 2.4 times larger than any previously available Nepali corpus. The paper also explores instruction tuning on monolingual Nepali data, providing a foundation for future research. The pre-trained models outperformed the existing best model by 2 points on the Nep-gLUE benchmark, scoring 95.60, and also showed improvements in both understanding and generating Nepali text. (An illustrative code sketch follows this table.) |
Low | GrooveSquid.com (original content) | The paper is about making computers better at understanding and writing the Nepali language. Most current models are trained mainly on English data, so they are not very good at understanding or producing Nepali. The researchers collected a huge amount of Nepali text and used it to train three models that can understand and generate Nepali. They then tested these models and found that they were better than previous models at tasks such as answering questions and writing new text in Nepali. |
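The medium-difficulty summary above describes pre-training BERT-style models on Nepali text but does not show how such a checkpoint would be used. As a rough illustration only, the sketch below queries a hypothetical pre-trained Nepali BERT checkpoint through the Hugging Face transformers fill-mask pipeline; the model identifier is a placeholder, not the name of any checkpoint released with the paper.

```python
# Minimal sketch (not the authors' released code): querying a pre-trained
# Nepali BERT-style checkpoint with the Hugging Face `transformers` library.
# The model identifier below is a hypothetical placeholder.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="your-org/nepali-bert-base",  # placeholder, not a real checkpoint name
)

# BERT-style models are pre-trained with a masked-language-modelling objective,
# so a quick sanity check is to let the model fill in a masked Nepali token.
# "काठमाडौं नेपालको [MASK] हो ।" ~ "Kathmandu is the [MASK] of Nepal."
for prediction in fill_mask("काठमाडौं नेपालको [MASK] हो ।"):
    print(f"{prediction['token_str']}\t{prediction['score']:.3f}")
```

The fill-mask pipeline exercises the masked-language-model head that BERT and RoBERTa are pre-trained with; a GPT-2-style checkpoint would instead be queried with a text-generation pipeline.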
Keywords
» Artificial intelligence » BERT » GPT » Instruction tuning » Natural language processing » NLP » Transformer