Summary of Confidence-aware Reward Optimization For Fine-tuning Text-to-image Models, by Kyuyoung Kim et al.
Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models
by Kyuyoung Kim, Jongheon Jeong, Minyong An, Mohammad Ghavamzadeh, Krishnamurthy Dvijotham, Jinwoo Shin, Kimin Lee
First submitted to arXiv on: 2 Apr 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Fine-tuning text-to-image models with reward functions trained on human feedback data has been shown to align model behavior with human intent. However, using such proxy objectives can compromise model performance through reward overoptimization. To investigate this issue, the authors introduce the Text-Image Alignment Assessment (TIA2) benchmark, a diverse collection of text prompts, images, and human annotations. Evaluation on the benchmark reveals frequent misalignment between state-of-the-art reward models and human assessment, and the study empirically demonstrates that overoptimization occurs when a poorly aligned reward model is used as the fine-tuning objective. To address this, the authors propose TextNorm, a simple method that enhances alignment by normalizing reward scores based on model confidence estimated across semantically contrastive text prompts. In human evaluation, TextNorm achieves twice as many wins in text-image alignment compared to baseline reward models. |
Low | GrooveSquid.com (original content) | Scientists have been trying to get computers to understand what we mean when we describe images. They use a special trick called “reward functions” to help the computer learn. But sometimes this trick can actually make things worse, not better. The researchers created a test to see how well these rewards work and found that they often don’t match what humans think is correct. To fix this problem, they came up with a new way of using these rewards that makes them more accurate. This helps the computer learn even better than before! |
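The core idea behind TextNorm, as summarized above, is to score an image not by its raw reward alone but by how confidently the reward model prefers the target prompt over semantically contrastive alternatives. A minimal sketch of one way to realize this is below; the softmax formulation, the `toy_reward` function, and the example prompts are illustrative assumptions, not the paper’s exact implementation.

```python
import math

def textnorm(reward_fn, image, prompt, contrastive_prompts, temperature=1.0):
    """Confidence-normalized reward: softmax of the target prompt's reward
    against rewards for semantically contrastive prompts, so the score
    reflects relative confidence rather than raw reward magnitude."""
    all_prompts = [prompt] + list(contrastive_prompts)
    scores = [reward_fn(image, p) / temperature for p in all_prompts]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return exps[0] / sum(exps)

# Hypothetical stand-in for a learned reward model's scores.
def toy_reward(image, prompt):
    return {"a red cube": 2.0, "a blue cube": 1.0, "a red sphere": 0.5}[prompt]

score = textnorm(toy_reward, image=None, prompt="a red cube",
                 contrastive_prompts=["a blue cube", "a red sphere"])
```

Because the output lies in (0, 1) and rises only when the target prompt outscores its contrastive alternatives, an overoptimized image that scores highly on every prompt no longer receives an inflated reward.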
Keywords
- Artificial intelligence
- Alignment
- Fine-tuning