Summary of T-REG: Preference Optimization with Token-Level Reward Regularization, by Wenxuan Zhou et al.
T-REG: Preference Optimization with Token-Level Reward Regularization
by Wenxuan Zhou, Shujian Zhang, Lingxiao Zhao, Tao Meng
First submitted to arXiv on: 3 Dec 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The proposed token-level reward regularization (T-REG) method leverages the self-refinement capabilities of large language models (LLMs): through contrastive prompting, the model generates its own token-level rewards, which act as a regularizer during preference optimization and help distribute the sequence-level reward across individual tokens. On the instruction-following benchmarks Alpaca Eval 2 and Arena-Hard, T-REG outperforms baseline methods by up to 3.8% and 4.4%, respectively. A rough code sketch of this idea appears below the table. |
Low | GrooveSquid.com (original content) | Large language models need help understanding what we want from them. Currently, they’re given a single reward for the whole response, which isn’t very helpful. Some methods try to improve this by giving rewards for individual words, but these methods rely on special training or human helpers. This paper proposes a new way of giving rewards called token-level reward regularization (T-REG). It uses something called contrastive prompting that lets the model figure out how to give rewards to individual words itself. This helps the model learn better and make more accurate predictions. |
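To make the mechanism more concrete, the sketch below shows one way a sequence-level preference loss (in the style of DPO) could be combined with a token-level reward regularization term. This is a minimal, hypothetical illustration rather than the authors' implementation: the tensor names, the mean-squared-error form of the regularizer, and the `reg_weight` coefficient are assumptions, and the self-generated token rewards are assumed to have been obtained separately (for example, via contrastive prompting).

```python
# Minimal sketch (not the authors' code): a DPO-style sequence-level preference loss
# plus a token-level reward regularization term. Shapes, names, and the weighting
# coefficient reg_weight are illustrative assumptions.
import torch
import torch.nn.functional as F


def treg_style_loss(
    policy_logps_chosen,     # (batch, seq) per-token log-probs of the chosen response under the policy
    policy_logps_rejected,   # (batch, seq) per-token log-probs of the rejected response under the policy
    ref_logps_chosen,        # (batch, seq) same, under the frozen reference model
    ref_logps_rejected,      # (batch, seq)
    token_rewards_chosen,    # (batch, seq) self-generated token-level rewards (assumed precomputed)
    token_rewards_rejected,  # (batch, seq)
    mask_chosen,             # (batch, seq) 1.0 for response tokens, 0.0 for prompt/padding
    mask_rejected,           # (batch, seq)
    beta: float = 0.1,
    reg_weight: float = 0.5,
):
    # Implicit per-token rewards: beta-scaled log-ratio of policy to reference.
    token_r_chosen = beta * (policy_logps_chosen - ref_logps_chosen)
    token_r_rejected = beta * (policy_logps_rejected - ref_logps_rejected)

    # Sequence-level DPO loss built from the summed per-token log-ratios.
    seq_margin = (token_r_chosen * mask_chosen).sum(-1) - (token_r_rejected * mask_rejected).sum(-1)
    dpo_loss = -F.logsigmoid(seq_margin).mean()

    # Token-level regularization: pull the implicit per-token rewards toward the
    # self-generated token-level rewards (masked mean-squared error).
    def masked_mse(pred, target, mask):
        return ((pred - target) ** 2 * mask).sum() / mask.sum().clamp(min=1.0)

    reg = masked_mse(token_r_chosen, token_rewards_chosen, mask_chosen) + \
          masked_mse(token_r_rejected, token_rewards_rejected, mask_rejected)

    return dpo_loss + reg_weight * reg
```

In this sketch the regularizer pulls the model's implicit per-token rewards (the beta-scaled policy-to-reference log-ratios) toward the self-generated token-level rewards; the exact regularization used in the paper may take a different form.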
Keywords
» Artificial intelligence » Alignment » Prompting » Regularization » Token