Summary of Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models, by Kyuyoung Kim et al.


Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models

by Kyuyoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, Kimin Lee

First submitted to arXiv on: 2 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (paper authors)
Read the original abstract here.

Medium Difficulty Summary (GrooveSquid.com, original content)
Fine-tuning text-to-image models with reward functions trained on human feedback data has been shown to align model behavior with human intent. However, optimizing such proxy objectives can compromise model performance, a problem known as reward overoptimization. To investigate this issue, the authors introduce the Text-Image Alignment Assessment (TIA2) benchmark, comprising a diverse collection of text prompts, images, and human annotations. Evaluating state-of-the-art reward models on this benchmark reveals that they frequently misalign with human assessment, and the study empirically demonstrates that overoptimization occurs when a poorly aligned reward model is used as the fine-tuning objective. To address this, the authors propose TextNorm, a simple method that enhances alignment based on reward model confidence estimated across a set of semantically contrastive text prompts. Fine-tuning with the confidence-calibrated rewards yields twice as many wins in human evaluation of text-image alignment compared to baseline reward models.

Low Difficulty Summary (GrooveSquid.com, original content)
Scientists have been trying to get computers to understand what we mean when we describe images. They use a special trick called “reward functions” to help the computer learn. But sometimes this trick can actually make things worse, not better. The researchers created a test to see how well these rewards work and found that they often don’t match what humans think is correct. To fix this problem, they came up with a new way of using these rewards that makes them more accurate. This helps the computer learn even better than before!

Keywords

  • Artificial intelligence
  • Alignment
  • Fine-tuning