Summary of Towards Santali Linguistic Inclusion: Building the First Santali-to-English Translation Model Using mT5 Transformer and Data Augmentation, by Syed Mohammed Mostaque Billah et al.
Towards Santali Linguistic Inclusion: Building the First Santali-to-English Translation Model using mT5 Transformer and Data Augmentation
by Syed Mohammed Mostaque Billah, Ateya Ahmed Subarna, Sudipta Nandi Sarna, Ahmad Shawkat Wasit, Anika Fariha, Asif Sushmit, Arig Yousuf Sadeque
First submitted to arXiv on: 29 Nov 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper addresses the lack of translation models for Santali, a low-resource language, by testing whether a functional Santali-to-English machine translation model can be built. The authors compare how different parallel corpora and transformer architectures perform on this task. They find that transfer learning is a viable technique for Santali: the pretrained mT5 transformer outperforms untrained transformers. They also show that data augmentation further improves model performance. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper tries to help people who speak the Santali language by creating a way to translate Santali into English. Right now, there isn’t a good machine translation model for Santali, which makes it hard for Santali speakers and speakers of other languages to understand each other. The researchers tried different approaches and found that special computer programs called transformers can make the translations better. |
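
To give a feel for what data augmentation on a parallel corpus can look like, here is a minimal Python sketch. It is an illustration only and not the method used in the paper, which these summaries do not describe. It assumes one simple technique: joining two existing sentence pairs to make a new, longer training pair. The function name `augment_by_concatenation` and the placeholder sentences are hypothetical.

```python
import random


def augment_by_concatenation(pairs, n_new, seed=0):
    """Build extra (source, target) pairs by joining two existing pairs.

    Assumption: concatenating aligned sentences keeps them aligned.
    This is one generic augmentation idea, not the paper's technique.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    augmented = []
    for _ in range(n_new):
        # Pick two different sentence pairs from the corpus.
        (src_a, tgt_a), (src_b, tgt_b) = rng.sample(pairs, 2)
        # Join sources and targets in the same order to stay aligned.
        augmented.append((f"{src_a} {src_b}", f"{tgt_a} {tgt_b}"))
    return augmented


# Placeholder corpus: real data would hold Santali sentences on the left
# and their English translations on the right.
corpus = [
    ("<santali sentence 1>", "<english sentence 1>"),
    ("<santali sentence 2>", "<english sentence 2>"),
    ("<santali sentence 3>", "<english sentence 3>"),
]

extra = augment_by_concatenation(corpus, n_new=4)
training_data = corpus + extra
print(len(training_data), "training pairs after augmentation")
```

The enlarged `training_data` list would then be used to fine-tune a pretrained model such as mT5. That fine-tuning step is the transfer learning described in the summaries above.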
Keywords
» Artificial intelligence » Data augmentation » Transfer learning » Transformer » Translation