Uncovering Factor Level Preferences to Improve Human-Model Alignment

by Juhyun Oh, Eunsu Kim, Jiseon Kim, Wenda Xu, Inha Cha, William Yang Wang, Alice Oh

First submitted to arXiv on: 9 Oct 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper introduces PROFILE, a framework that uncovers and quantifies the influence of the specific factors driving Large Language Model (LLM) preferences. By analyzing preferences at this granular, factor level, PROFILE explains why LLMs often exhibit biases or tendencies that diverge from human preferences. The authors apply PROFILE to three tasks: summarization, helpful response generation, and document-based question answering. Their analysis reveals significant discrepancies between human and LLM preferences in generation tasks, but strong alignment in evaluation tasks. The work highlights the importance of explainable preference analysis and demonstrates how factor-level insights can be leveraged to improve alignment with human preferences.

Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper is about understanding why large language models make certain choices or have specific preferences. These models often make decisions that differ from what humans would prefer, like writing in a style that's too fancy. Right now, it's hard to understand why this happens because we don't have good methods for comparing human and model preferences in detail. The authors introduce a new framework called PROFILE that helps us see which specific factors are driving these differences. They apply PROFILE to three different tasks and find that models tend to be more different from humans when they're generating text, but more similar when evaluating it. This matters because understanding what's going wrong can help improve the models so they make better choices.

Keywords

» Artificial intelligence  » Alignment  » Large language model  » Question answering  » Summarization