Summary of Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation, by Katherine M. Collins et al.
Beyond Thumbs Up/Down: Untangling Challenges of Fine-Grained Feedback for Text-to-Image Generation
by Katherine M. Collins, Najoung Kim, Yonatan Bitton, Verena Rieser, Shayegan Omidshafiei, Yushi Hu, Sherol Chen, Senjuti Dutta, Minsuk Chang, Kimin Lee, Youwei Liang, Georgina Evans, Sahil Singla, Gang Li, Adrian Weller, Junfeng He, Deepak Ramachandran, Krishnamurthy Dj Dvijotham
First submitted to arXiv on: 24 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | The paper investigates the effectiveness of fine-grained feedback for learning accurate reward functions for text-to-image generation, compared to traditional coarse-grained feedback. It argues that capturing nuanced distinctions in image quality and prompt alignment can improve model performance, particularly for systems catering to diverse societal preferences. However, demonstrating its superiority is not automatic: experiments on real and synthetic preference data reveal how difficult it is to build effective models, owing to the interplay between model choice, feedback type, and the alignment between human judgment and its computational interpretation. The findings suggest that fine-grained feedback can lead to worse models in some settings, though it can be more helpful in controlled settings with known attributes (a toy sketch contrasting the two feedback regimes appears after this table). |
| Low | GrooveSquid.com (original content) | This paper explores how humans give feedback when creating images from text descriptions. It wants to know what kind of feedback works best for making good image generation models. The researchers compare two kinds of feedback: a simple thumbs up or down, and more detailed ratings that take into account the quality of the generated image and its match with the original text prompt. They find that while fine-grained feedback seems promising, it’s not always better than coarse-grained feedback. Sometimes, using fine-grained feedback can even make things worse! The results show that finding the right kind of feedback is important, and that we need to consider which attributes are most useful when giving feedback. |
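
To make the contrast in the medium summary concrete, here is a minimal, hypothetical sketch (not the paper’s code or data): a coarse-grained reward model fit to thumbs-up/down labels versus per-attribute models fit to fine-grained quality and prompt-alignment scores that are then aggregated into one reward. The synthetic features, attribute definitions, and 50/50 aggregation weights are illustrative assumptions, not choices from the paper.

```python
# Illustrative sketch of coarse vs. fine-grained feedback for reward modeling.
# Features stand in for (prompt, image) embeddings; attributes are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)
n, d = 500, 16                       # number of (prompt, image) pairs, feature dim
X = rng.normal(size=(n, d))          # stand-in for image/prompt embeddings

# Synthetic ground-truth attributes: image quality and prompt alignment.
w_quality, w_align = rng.normal(size=d), rng.normal(size=d)
quality = X @ w_quality + 0.1 * rng.normal(size=n)
align = X @ w_align + 0.1 * rng.normal(size=n)

# Coarse-grained feedback: a single thumbs up/down derived from both attributes.
thumbs = ((quality + align) > 0).astype(int)
coarse_reward_model = LogisticRegression().fit(X, thumbs)

# Fine-grained feedback: one regressor per attribute, aggregated afterwards.
quality_model = Ridge().fit(X, quality)
align_model = Ridge().fit(X, align)

def fine_grained_reward(x):
    """Combine per-attribute predictions into a single scalar reward."""
    return 0.5 * quality_model.predict(x) + 0.5 * align_model.predict(x)

# Score a new candidate image under both reward models.
x_new = rng.normal(size=(1, d))
print("coarse P(thumbs up):", coarse_reward_model.predict_proba(x_new)[0, 1])
print("fine-grained reward:", fine_grained_reward(x_new)[0])
```

The sketch only shows the structural difference the paper studies: the coarse model learns from one aggregate judgment, while the fine-grained model must also get the per-attribute supervision and the aggregation right, which is one reason it is not automatically better.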
Keywords
» Artificial intelligence » Alignment » Image generation » Prompt