Summary of Leveraging Human Revisions for Improving Text-to-layout Models, by Amber Xie et al.
Leveraging Human Revisions for Improving Text-to-Layout Models
by Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li
First submitted to arXiv on: 16 May 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract on arXiv. |
Medium | GrooveSquid.com (original content) | This paper proposes a new approach to aligning large-scale generative models with human values by leveraging nuanced feedback in the form of human revisions. Whereas prior work relied on high-level labels or preferences, the authors ask expert designers to fix layouts produced by a pre-trained generative layout model, train a reward model from these revisions, and then use that reward model to optimize the original model with reinforcement learning from human feedback (RLHF). The resulting method, Revision-Aware Reward Models, enables a text-to-layout model to produce more modern, designer-aligned layouts (a minimal sketch of the reward-model training idea appears below this table). |
Low | GrooveSquid.com (original content) | This paper shows how we can make computers better understand what humans want by giving them more detailed feedback. Right now, most AI models are trained using simple labels or preferences, but this doesn't always get the job done. The authors of this paper asked expert designers to improve generated layouts, and then used this feedback to train a new model that produces better results. This could be really useful for making AI models more helpful in areas like design, where humans have complex needs. |
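
To make the training idea from the medium-difficulty summary concrete, here is a minimal sketch of how a reward model could be learned from human revisions. Everything in it is an illustrative assumption rather than the paper's released code: the names (`LayoutRewardModel`, `revision_reward_loss`, `layout_dim`) are hypothetical, the data is random stand-in tensors, and the preference-style Bradley-Terry loss is a common generic choice, not necessarily the paper's exact revision-aware objective.

```python
# Hypothetical sketch, not the paper's released code: all names and the
# random stand-in data are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LayoutRewardModel(nn.Module):
    """Scores a flattened layout vector; higher scores = more designer-aligned."""

    def __init__(self, layout_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(layout_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, layout: torch.Tensor) -> torch.Tensor:
        return self.net(layout).squeeze(-1)


def revision_reward_loss(model: LayoutRewardModel,
                         generated: torch.Tensor,
                         revised: torch.Tensor) -> torch.Tensor:
    """Preference-style (Bradley-Terry) loss: the designer-revised layout
    should score higher than the model's original output. The paper's
    revision-aware objective may differ; this is a common stand-in."""
    return -F.logsigmoid(model(revised) - model(generated)).mean()


# Toy training loop on random stand-in data, for illustration only.
layout_dim = 32
reward_model = LayoutRewardModel(layout_dim)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

for step in range(100):
    generated = torch.randn(16, layout_dim)                   # generator outputs (stand-in)
    revised = generated + 0.1 * torch.randn(16, layout_dim)   # designer revisions (stand-in)
    loss = revision_reward_loss(reward_model, generated, revised)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the full pipeline the summary describes, a learned reward model like this would then supply the reward signal for RLHF fine-tuning of the pre-trained text-to-layout generator (for example, via a policy-gradient or PPO-style update), which is not shown in this sketch.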
Keywords
» Artificial intelligence » Reinforcement learning from human feedback » RLHF