Summary of How Does Multi-task Training Affect Transformer In-context Capabilities? Investigations with Function Classes, by Harmon Bhasin et al.


How does Multi-Task Training Affect Transformer In-Context Capabilities? Investigations with Function Classes

by Harmon Bhasin, Timothy Ossowski, Yiqiao Zhong, Junjie Hu

First submitted to arXiv on: 4 Apr 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)

This paper explores the combination of multi-task learning (MTL) and in-context learning (ICL) for large language models (LLMs). The authors propose several curriculum learning strategies to train ICL models that learn tasks efficiently while being robust to out-of-distribution examples. Experimental results show that ICL models can effectively learn difficult tasks by training on progressively harder tasks while mixing in prior tasks, which leads to higher data efficiency and more stable convergence.

Low Difficulty Summary (written by GrooveSquid.com, original content)

This paper is about how computers can learn new skills quickly from just a few examples. The researchers want to make these computers even better learners. They try different ways of training them and find that teaching several tasks together, starting with the easier ones, works very well. As a result, the computers can do many different tasks well without needing as much data or training time.
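
The curriculum idea in the medium-difficulty summary (train on progressively harder tasks while mixing in prior tasks) can be sketched as a small task scheduler. This is only an illustration, not the authors' implementation. The task names, the equal-length training stages, and the uniform sampling over unlocked tasks are all assumptions made for this example.

```python
import random

# Hypothetical task list, ordered from easiest to hardest (an assumption).
TASKS = ["linear", "quadratic", "cubic"]


def active_tasks(step, total_steps, tasks, mixed=True):
    """Return the tasks that may be sampled at a given training step."""
    if not 0 <= step < total_steps:
        raise ValueError("step must lie in [0, total_steps)")
    # Split training into equal-length stages, one stage per task.
    stage = min(step * len(tasks) // total_steps, len(tasks) - 1)
    # Mixed: the current task plus every earlier one.
    # Sequential: only the current task.
    return tasks[: stage + 1] if mixed else [tasks[stage]]


def sample_task(step, total_steps, tasks, mixed=True, rng=random):
    """Draw one task uniformly from those active at this step."""
    return rng.choice(active_tasks(step, total_steps, tasks, mixed))
```

With nine training steps and the three placeholder tasks, `active_tasks(4, 9, TASKS)` returns the first two tasks, while `active_tasks(4, 9, TASKS, mixed=False)` returns only the second. Keeping earlier tasks in the pool is what "mixing in prior tasks" refers to in this sketch.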

Keywords

  • Artificial intelligence
  • Curriculum learning
  • Multi-task