
Summary of GRATH: Gradual Self-Truthifying for Large Language Models, by Weixin Chen et al.


GRATH: Gradual Self-Truthifying for Large Language Models

by Weixin Chen, Dawn Song, Bo Li

First submitted to arXiv on: 22 Jan 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty | Written by | Summary
High | Paper authors | High Difficulty Summary: Read the original abstract here
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The proposed GRAdual self-truTHifying (GRATH) method improves the truthfulness of large language models (LLMs). It uses out-of-domain question prompts to generate pairwise truthfulness training data, then optimizes the model on that data via direct preference optimization (DPO). GRATH repeats these two steps, refining the truthfulness data and updating the model in turn, so that model truthfulness improves gradually. Empirically, GRATH achieves state-of-the-art performance on TruthfulQA, with MC1 accuracy of 54.71% and MC2 accuracy of 69.10%, surpassing even larger LLMs.
Low | GrooveSquid.com (original content) | Low Difficulty Summary: Large language models (LLMs) need to generate truthful content, but they still struggle to do so, and they score poorly on tests like TruthfulQA. To fix this, researchers came up with a new way to improve the truthfulness of LLMs called GRATH. It uses special questions to create training data and makes the model better at telling the truth by comparing answers. With each round of training, GRATH helps models become more truthful without sacrificing other important skills.
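
For readers who want to see what "optimizing via DPO" means in practice, here is a minimal sketch of the standard DPO loss for a single answer pair. It is not taken from the GRATH paper's code. The function name, the argument names, and the default beta of 0.1 are assumptions made for illustration, and the log-probabilities are assumed to be summed token log-probabilities of each answer under the model being trained (the "policy") and under a frozen reference copy.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (question, preferred answer, rejected answer) pair.

    All names and the default beta are illustrative assumptions,
    not values reported in the summarized paper.
    """
    # How much more the policy favors the preferred answer over the
    # rejected one, measured relative to the frozen reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # loss = -log(sigmoid(beta * margin)) = softplus(-beta * margin),
    # computed in a numerically stable form.
    z = -beta * margin
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Hypothetical log-probabilities: the policy prefers the truthful answer
# more strongly than the reference does, so the loss falls below log(2).
loss = dpo_loss(-5.0, -9.0, -7.0, -7.0)
print(loss < math.log(2))
```

In a truthfulness setting, the preferred answer would be the correct one and the rejected answer the incorrect one. When the policy and the reference agree, the loss is log(2), and it shrinks as the policy shifts probability toward the preferred answer. According to the summary above, GRATH alternates between regenerating the answer pairs and running this kind of update.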

Keywords

» Artificial intelligence  » Optimization