
Summary of DefSent+: Improving Sentence Embeddings of Language Models by Projecting Definition Sentences into a Quasi-isotropic or Isotropic Vector Space of Unlimited Dictionary Entries, by Xiaodong Liu


DefSent+: Improving sentence embeddings of language models by projecting definition sentences into a quasi-isotropic or isotropic vector space of unlimited dictionary entries

by Xiaodong Liu

First submitted to arXiv on: 25 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty: High
Written by: Paper authors
Read the original abstract here

Summary difficulty: Medium
Written by: GrooveSquid.com (original content)
The paper presents a significant improvement over a previous conference paper, DefSent, which aimed to improve the sentence embeddings of language models by projecting definition sentences into the vector space of dictionary entries. The paper identifies two methodological limitations in that approach: dictionary entries are constrained to a single-word vocabulary, and the semantic representations taken from language models are anisotropic. To overcome these limitations, it proposes a method that progressively builds entry embeddings not subject to these constraints. This approach, dubbed DefSent+, achieves noticeably better quality sentence embeddings and, when used to further train data-augmented models, reaches state-of-the-art performance on measuring sentence similarity. DefSent+ is also competitive in feature-based transfer for downstream NLP tasks.

Summary difficulty: Low
Written by: GrooveSquid.com (original content)
The paper improves upon a previous study called DefSent by fixing its limitations. The old way of projecting definition sentences into the space of dictionary entries has problems that hold it back. The new method builds dictionary entry embeddings from scratch and does away with the single-word vocabulary constraint. As a result, the sentence embeddings are of better quality. This approach is called DefSent+, and it is very good at measuring sentence similarity. It even beats other methods that do not rely on labeled datasets! It also works well for transferring features to other NLP tasks.
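
For readers unfamiliar with the term "anisotropic" used in the summaries above, the toy sketch below may help. It is not the DefSent+ method and uses no real language model: the vectors are random numbers with a large shared offset, and the helper names (`cosine`, `mean_pairwise_cosine`, `center`) are made up for this illustration. It only shows that when embeddings all lean in one common direction, their average pairwise cosine similarity is high, and that removing the shared direction (here by simple mean-centering, an assumption chosen for brevity) spreads them out more evenly.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors):
    """Average cosine similarity over all distinct pairs of vectors."""
    total, count = 0.0, 0
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            total += cosine(vectors[i], vectors[j])
            count += 1
    return total / count


def center(vectors):
    """Subtract the per-dimension mean from every vector."""
    dim = len(vectors[0])
    mean = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
    return [[v[k] - mean[k] for k in range(dim)] for v in vectors]


# Toy "embeddings" (hypothetical data): Gaussian noise plus a large
# offset shared by every vector, which makes the set anisotropic.
rng = random.Random(0)
offset = [5.0] * 8
toy = [[rng.gauss(0.0, 1.0) + o for o in offset] for _ in range(50)]

before = mean_pairwise_cosine(toy)          # close to 1: vectors point the same way
after = mean_pairwise_cosine(center(toy))   # close to 0: directions are spread out

print(f"mean pairwise cosine before centering: {before:.3f}")
print(f"mean pairwise cosine after centering:  {after:.3f}")
```

A mean pairwise cosine near zero is only one rough indicator of isotropy; the paper's own construction of a quasi-isotropic or isotropic entry space is described in the original abstract and paper, not here.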

Keywords

» Artificial intelligence  » NLP  » Vector space