Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models

by Hongbang Yuan, Yubo Chen, Pengfei Cao, Zhuoran Jin, Kang Liu, Jun Zhao

First submitted to arXiv on: 18 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates the effectiveness of preference learning in fine-tuning large language models (LLMs) to improve their factuality. The authors find that existing methods primarily evaluate model performance on in-domain datasets, neglecting out-of-domain (OOD) datasets where hallucination is a significant issue. Their experiments show that fine-tuned models often perform poorly on OOD datasets, with some even experiencing a decrease in factuality. Through analysis of token distribution shifts, the authors identify under-alignment as the primary cause of this failure. To address this, they propose APEFT (Atomic Preference Enhanced Factuality Tuning), a framework that enhances the model’s awareness of factuality at the level of individual facts (see the illustrative sketch after the summaries). Extensive experiments demonstrate APEFT’s effectiveness, improving model performance by an average of 3.45% on both in-domain and OOD datasets.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Imagine if you could teach computers to be more honest and accurate when generating text. That’s what this research paper is all about! The scientists found that even the best language models sometimes make things up, which they call “hallucination”. They wanted to see how well these models would do if they were trained to be more truthful. Surprisingly, most of them didn’t do much better on new, unfamiliar texts. The researchers figured out why this was happening and came up with a new way to improve the models’ honesty. They tested it and found that it made the models 3.45% more accurate overall.

Keywords

» Artificial intelligence  » Alignment  » Fine tuning  » Hallucination  » Token