Summary of Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning, by Jian Gu et al.
Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning
by Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang
First submitted to arXiv on: 29 Jan 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract (read it on arXiv). |
| Medium | GrooveSquid.com (original content) | The proposed approach, "vocabulary-defined semantics," leverages the language model (LM) vocabulary to improve in-context learning by aligning the semantic properties of LMs with those of downstream data and tasks. It computes semantically equivalent latent representations for output labels from the LM vocabulary, then applies a clustering operation to tackle the hard-to-optimize problem of demonstration selection (an illustrative sketch of these two steps follows the table). Across diverse textual understanding datasets and multiple models, the approach outperforms state-of-the-art methods, achieving 3-49% improvements while cutting computation time in half. |
| Low | GrooveSquid.com (original content) | In-context learning lets language models adapt to new data or tasks using a few examples, called demonstrations, placed in the prompt. The method performs well without fine-tuning, but its performance depends heavily on the quality of the demonstrations. Some existing approaches index samples by the similarity of their output-side logits; even so, selecting good demonstrations remains challenging. To address this, the paper combines in-context learning with clustering, using the LM vocabulary to match the semantic properties of LMs with those of downstream data and tasks. The method works well across different datasets and models, achieving better results while reducing computation time. |
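To make the mechanism concrete, here is a minimal NumPy sketch of the two steps the medium summary describes: deriving a latent "anchor" per output label from the LM vocabulary, then selecting demonstrations by closeness to those anchors. This is an illustration, not the authors' implementation: the toy sizes, the label-to-token mapping (`label_token_ids`), and the pseudo-inverse construction of the label anchors are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, vocab_size = 64, 1000  # toy sizes, assumed for illustration

# Stand-in for the LM head (unembedding) matrix: hidden state -> logits.
W_unembed = rng.normal(size=(vocab_size, hidden_dim))

# Hypothetical mapping from task labels to vocabulary token ids.
label_token_ids = {"positive": 42, "negative": 77}

# Step 1: map each label's one-hot vocabulary target back into latent
# space via the pseudo-inverse of the LM head, one anchor per label
# (an assumed reading of "semantically equivalent latent representations").
W_pinv = np.linalg.pinv(W_unembed)  # (hidden_dim, vocab_size)
anchors = {lbl: W_pinv[:, tok] for lbl, tok in label_token_ids.items()}

# Stand-in pool of labeled candidate demonstrations, each represented by
# a last-layer hidden state the LM would produce for that example.
pool_hidden = rng.normal(size=(200, hidden_dim))
pool_labels = rng.choice(list(label_token_ids), size=200)

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# Step 2: keep the k examples closest to their own label's anchor,
# i.e. the most semantically central members of each label's cluster.
def select_demonstrations(k=4):
    scored = sorted(
        ((cosine(h, anchors[lbl]), i)
         for i, (h, lbl) in enumerate(zip(pool_hidden, pool_labels))),
        reverse=True,
    )
    return [i for _, i in scored[:k]]

print(select_demonstrations())  # indices of the selected demonstrations
```

In practice the hidden states would come from a real LM rather than random vectors, and the clustering step in the paper is more involved than a nearest-anchor rule; the sketch only conveys the overall shape of the approach.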
Keywords
- Artificial intelligence
- Clustering
- Fine-tuning
- Language model
- Logits
- Semantics