


Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence

by Abhinav Patil, Jaap Jumelet, Yu Ying Chiu, Andy Lapastora, Peter Shen, Lexie Wang, Clevis Willrich, Shane Steinert-Threlkeld

First submitted to arXiv on: 24 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper proposes Filtered Corpus Training (FiCT), a technique that trains language models (LMs) on corpora from which specific linguistic constructions have been filtered out, in order to measure how well the models generalize linguistically from indirect evidence. The method is applied to both LSTM and Transformer LMs, using filtered datasets that target a range of linguistic phenomena. The results show that while Transformers are the stronger LMs overall, both architectures perform equally and surprisingly well on linguistic generalization, indicating that they can learn from indirect cues.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper helps us understand how language models work by training them on special datasets from which certain types of words or phrases have been removed. The goal is to see whether the trained models can still pick up the relevant language rules even though they never saw those specific words or phrases. The researchers tested two kinds of models: an older kind called LSTMs and a newer kind called Transformers. They found that both kinds did surprisingly well at learning from hints rather than just memorizing what they were shown.
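
The filtering step described in the summaries above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`filter_corpus`, `contains_whom`), the toy corpus, and the choice of the word "whom" as the filtered target are all assumptions made for the example, since the paper's actual filters are not described on this page.

```python
import re
from typing import Callable, Iterable, List, Tuple


def filter_corpus(
    sentences: Iterable[str],
    matches_target: Callable[[str], bool],
) -> Tuple[List[str], List[str]]:
    """Split sentences into (kept, removed) according to a target predicate."""
    kept: List[str] = []
    removed: List[str] = []
    for sentence in sentences:
        if matches_target(sentence):
            removed.append(sentence)  # contains the target; excluded from training
        else:
            kept.append(sentence)     # stays in the filtered training corpus
    return kept, removed


# Hypothetical toy target: any sentence containing the word "whom".
_WHOM = re.compile(r"\bwhom\b", re.IGNORECASE)


def contains_whom(sentence: str) -> bool:
    return _WHOM.search(sentence) is not None


# Toy corpus, invented for illustration only.
corpus = [
    "The editor whom we met approved the draft.",
    "The editor approved the draft.",
    "Whom did the committee choose?",
    "The committee chose a new chair.",
]

kept, removed = filter_corpus(corpus, contains_whom)
```

In an experiment of this kind, the `kept` sentences would serve as the training data, and the trained model would then be tested on the construction it never saw directly.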

Keywords

» Artificial intelligence  » Generalization  » LSTM  » Transformer