Summary of Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?, by Nicy Scaria et al.


Can Small Language Models Learn, Unlearn, and Retain Noise Patterns?

by Nicy Scaria, Silvester John Joseph Kennedy, Deepak Subramani

First submitted to arXiv on: 1 Jul 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This study explores the capabilities of Small Language Models (SLMs) in learning, retaining, and eliminating different types of noise in data. Four pre-trained SLMs with 1 to 3 billion parameters were used: Olmo 1B, Qwen1.5 1.8B, Gemma 2B, and Phi2 2.7B. The models were instruction-tuned on noise-free data and then tested with in-context examples to evaluate whether they pick up noisy patterns. Results show that Olmo is highly sensitive to noise, quickly adapting to noisy patterns, while Phi2 resists learning character-level and transliteration noise due to its high-quality pretraining data. Gemma excels with transliteration noise, likely benefiting from its multilingual pretraining. The findings can be used to develop robust training strategies for SLMs.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This study looks at how Small Language Models (SLMs) deal with noisy data. Researchers tested four different SLMs on different types of noise and found that each model handled noise differently. One model, Olmo, was very sensitive to noise and picked up noisy patterns quickly. Another model, Phi2, resisted certain kinds of noise because its training data was high quality. The study’s results can help developers create better training strategies for SLMs.

Keywords

* Artificial intelligence
* Pretraining