Development of Pre-Trained Transformer-based Models for the Nepali Language

by Prajwal Thapa, Jinu Nyachhyon, Mridul Sharma, Bal Krishna Bal

First submitted to arXiv on: 24 Nov 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.
Medium Difficulty Summary (original content by GrooveSquid.com)
The paper addresses the underrepresentation of the Nepali language in natural language processing (NLP) by pre-training three transformer-based models (BERT, RoBERTa, and GPT-2) exclusively for Nepali. The authors collected a 27.5 GB Nepali text dataset, approximately 2.4 times larger than any previously available Nepali corpus. The paper also explores instruction tuning on monolingual Nepali data, providing a foundation for future research. The pre-trained models outperformed the existing best model by 2 points on the Nep-gLUE benchmark, scoring 95.60, and also showed improvements in both understanding and generating Nepali text. (For a concrete picture of this kind of pre-training setup, see the sketch after the summaries below.)
Low Difficulty Summary (original content by GrooveSquid.com)
This paper is about making computers better at understanding and writing the Nepali language. Right now, most computer models are trained on English data, so they are not very good at understanding or producing Nepali. The researchers collected a huge amount of Nepali text and used it to train three models that can understand and generate Nepali text. They also tested these models and found that they were better than previous models at tasks like answering questions and writing new text in Nepali.

Keywords

» Artificial intelligence  » BERT  » GPT  » Instruction tuning  » Natural language processing  » NLP  » Transformer