
Summary of "Aligning Large Language Models by On-Policy Self-Judgment" by Sangkyu Lee et al.


Aligning Large Language Models by On-Policy Self-Judgment

by Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu

First submitted to arXiv on: 17 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper's original abstract, available on arXiv.
Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper presents SELF-JUDGE, a novel alignment framework that enables on-policy preference learning without a separate reward model (RM). Judge-augmented Supervised Fine-Tuning (JSFT) trains a single model to act as both policy and judge: by casting the pairwise judgment task as a special case of instruction-following, the model can evaluate preferences over responses generated on the fly by its current policy. Experimental results show that SELF-JUDGE outperforms baselines on preference benchmarks.
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about creating a new way to teach machines to make good choices. Right now, we need two different models: one to decide what actions to take and another to judge whether those actions are good or bad. The authors of this paper came up with a new idea that lets us use just one model for both tasks. They called it SELF-JUDGE. It works by treating the task of choosing between two responses as a special kind of instruction-following problem. This means the model can learn to make good choices without needing an extra judge. The results show that this approach is better than other methods.
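The single-model generate-and-judge loop described in these summaries can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual code: `generate` is a hypothetical stand-in for sampling from an LLM, and the judge prompt wording is invented for illustration.

```python
# Illustrative sketch of SELF-JUDGE's on-policy loop: the SAME model
# both generates candidate responses and judges which one is preferred.
# `model` is a hypothetical callable standing in for an LLM call.

def generate(model, prompt):
    # Placeholder: a real implementation would sample from the LLM.
    return model(prompt)

def self_judge_pair(model, instruction):
    # 1. Sample two on-the-fly responses from the current policy.
    response_a = generate(model, instruction)
    response_b = generate(model, instruction)
    # 2. Reuse the same model as judge, phrasing the pairwise judgment
    #    as an instruction-following task (hypothetical prompt wording).
    judge_prompt = (
        f"Instruction: {instruction}\n"
        f"Response A: {response_a}\n"
        f"Response B: {response_b}\n"
        "Which response is better? Answer A or B:"
    )
    verdict = generate(model, judge_prompt)
    # 3. The verdict yields a preference pair usable for a policy update.
    if verdict.strip().startswith("A"):
        return response_a, response_b  # (chosen, rejected)
    return response_b, response_a
```

Because generation and judgment share one set of weights, no separate reward model is needed; the preference pairs come from the current policy itself rather than from a fixed offline dataset.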

Keywords

* Artificial intelligence  * Alignment  * Fine-tuning  * Supervised