Summary of In-context Learning in Presence of Spurious Correlations, by Hrayr Harutyunyan et al.
In-context Learning in Presence of Spurious Correlations
by Hrayr Harutyunyan, Rafayel Darbinyan, Samvel Karapetyan, Hrant Khachatrian
First submitted to arXiv on: 4 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper and are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | In this research paper, the authors investigate whether in-context learning can be used for classification tasks that involve spurious features. They find that conventional approaches to training in-context learners are susceptible to these features and, when trained on a single task, tend to memorize the task rather than use the context to make predictions. The authors propose a novel technique to train an in-context learner for a specific classification task, which surprisingly matches or outperforms strong methods like ERM and GroupDRO. However, this learner does not generalize well to other tasks. To address this limitation, the authors show that training on a diverse dataset of synthetic in-context learning instances enables an in-context learner to generalize to unseen tasks (see the illustrative sketch after this table). |
Low | GrooveSquid.com (original content) | In-context learners are really good at solving problems when given some examples. Researchers have been trying to use these models for classification tasks, but found that the models can get tricked by fake features. They also discovered that training these models on only one task makes them ignore the context and just memorize the task instead of using the context to make predictions. The authors came up with a new way to train these learners, which surprisingly works really well for specific tasks! However, this learner doesn’t work well on other tasks. To fix this problem, they showed that training on many different examples helps these learners generalize to new problems. |
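
To make the key idea of the medium summary concrete, here is a minimal sketch of what one synthetic in-context learning instance with a spurious feature might look like. This is not the paper’s actual data construction: the linear labeling rule, the single spurious feature, and all names and parameters here (`make_icl_instance`, `p_spurious`, etc.) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_icl_instance(n_context=16, d_core=4, p_spurious=0.95):
    """Build one synthetic in-context classification instance.

    Each example has d_core informative features plus one extra feature
    that agrees with the label with probability p_spurious in the
    context examples, but is independent of the label for the query.
    """
    w = rng.normal(size=d_core)  # hidden linear rule defining this task's labels

    def sample(n, spurious_aligned):
        x_core = rng.normal(size=(n, d_core))
        y = (x_core @ w > 0).astype(float)
        if spurious_aligned:
            # spurious feature mostly copies the label in the context
            flip = rng.random(n) > p_spurious
            spur = np.where(flip, 1.0 - y, y)
        else:
            # spurious feature carries no signal for the query
            spur = rng.integers(0, 2, size=n).astype(float)
        return np.concatenate([x_core, spur[:, None]], axis=1), y

    ctx_x, ctx_y = sample(n_context, spurious_aligned=True)
    qry_x, qry_y = sample(1, spurious_aligned=False)
    return ctx_x, ctx_y, qry_x, qry_y

# A diverse training set: every instance draws a fresh labeling rule,
# so a learner cannot memorize a single task and must use the context.
dataset = [make_icl_instance() for _ in range(1000)]
```

Because each instance has its own labeling rule, and the spurious feature only tracks the label inside the context, a model trained on such data is pushed to infer the rule from the context examples rather than latch onto the spurious feature or memorize one task.
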
Keywords
* Artificial intelligence
* Classification