Vocabulary-Defined Semantics: Latent Space Clustering for Improving In-Context Learning

by Jian Gu, Aldeida Aleti, Chunyang Chen, Hongyu Zhang

First submitted to arXiv on: 29 Jan 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
This version is the paper’s original abstract; read it on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The proposed approach, “vocabulary-defined semantics,” leverages the language model (LM) vocabulary to improve in-context learning by aligning semantic properties between the LM and the downstream data or task. It computes semantically equivalent latent representations for the output labels based on the LM vocabulary, then applies a clustering operation to address the hard-to-optimize problem of selecting demonstrations. The approach outperforms state-of-the-art methods across diverse textual understanding datasets and multiple models, achieving improvements of 3-49% while reducing computation time by half.

Low Difficulty Summary (written by GrooveSquid.com, original content)
In-context learning lets language models adapt to new data or tasks using a few examples as demonstrations within prompts. This method performs well without fine-tuning, but the quality of the demonstrations affects performance. Some approaches index samples based on similarities between output-side logits. Despite these efforts, selecting good demonstrations remains challenging. To address this, a new approach combines in-context learning with clustering, focusing on the LM vocabulary to match semantic properties between LMs and downstream data or tasks. This method works well across different datasets and models, achieving better results while reducing computation time.
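
The summaries above describe the core mechanism only at a high level. As a rough illustration, the sketch below shows one generic way to pick demonstrations by clustering in a latent space. It rests on loud assumptions: the anchor vectors here are plain per-label centroids standing in for the paper’s vocabulary-defined latent representations, and the nearest-neighbor selection rule, the function name `select_demonstrations`, and all data are hypothetical, not the authors’ actual algorithm.

```python
# Minimal sketch of clustering-based demonstration selection.
# Assumptions (not from the paper): candidate embeddings are a
# precomputed 2-D array, and each label's "semantic anchor" is a
# simple centroid standing in for the paper's vocabulary-defined
# latent representation of that label.

import numpy as np

def select_demonstrations(embeddings, label_anchors, labels, k_per_label=4):
    """For each label, keep the k candidates whose latent
    representations lie closest to that label's anchor."""
    selected = []
    for label_id, anchor in enumerate(label_anchors):
        idx = np.where(labels == label_id)[0]  # candidates with this label
        if idx.size == 0:
            continue
        # Euclidean distance from each candidate to the label anchor.
        dists = np.linalg.norm(embeddings[idx] - anchor, axis=1)
        # Keep the k nearest candidates as demonstrations.
        selected.extend(idx[np.argsort(dists)[:k_per_label]].tolist())
    return selected

# Toy usage with random vectors standing in for real LM latents.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 16))   # 100 candidates, 16-dim latents
labels = rng.integers(0, 3, size=100)     # 3 output labels
anchors = np.stack([embeddings[labels == c].mean(axis=0) for c in range(3)])
print(select_demonstrations(embeddings, anchors, labels))
```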

Keywords

* Artificial intelligence
* Clustering
* Fine-tuning
* Language model
* Logits
* Semantics