Summary of OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis, by Qiushi Sun et al.
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
by Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu
First submitted to arXiv on: 27 Dec 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper proposes OS-Genesis, a pipeline for collecting high-quality trajectory data to train Graphical User Interface (GUI) agents powered by Vision-Language Models (VLMs). Current methods rely on human supervision or synthetic data generation, which are either resource-intensive or unable to guarantee data quality. OS-Genesis reverses the conventional process: agents first perceive environments and perform step-wise interactions, and high-quality tasks are then derived retrospectively to drive trajectory-level exploration. A trajectory reward model is used to ensure the quality of the generated trajectories. Training GUI agents with OS-Genesis significantly improves their performance on challenging online benchmarks. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper helps create better software agents that can operate computer interfaces the way humans do. Right now, it’s hard to get good data to train these agents because we need to either have people supervise them or make fake data. This is time-consuming and doesn’t always work well. The researchers came up with a new way to collect data called OS-Genesis. Instead of following a plan, the agents first explore their environment, and the tasks are worked out afterwards from what they did. Then, the data can be used to train the agents to perform tasks better. |
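
The "interaction first, task second" idea in the summaries above can be illustrated with a toy sketch. This is not the authors' implementation: the tiny GUI, the `Step` type, and every function name here are hypothetical, the task-writing step is a string template standing in for a model, and the reward is a simple completion ratio standing in for a trajectory reward model.

```python
from dataclasses import dataclass

# All names below are hypothetical and exist only for illustration.

@dataclass(frozen=True)
class Step:
    before: str  # screen shown before the action
    action: str  # e.g. "click:Settings"
    after: str   # screen shown after the action

# Toy GUI: (screen, action) -> next screen.
TOY_GUI = {
    ("home", "click:Settings"): "settings",
    ("settings", "click:Wi-Fi"): "wifi",
    ("home", "click:Clock"): "clock",
}

def explore(gui):
    """Interaction-first phase: record every transition, with no task in mind."""
    return [Step(before, action, after) for (before, action), after in gui.items()]

def synthesize_task(step):
    """Derive a task description from an observed transition (template stand-in)."""
    target = step.action.split(":", 1)[1]
    return f"Open {target} from the {step.before} screen"

def rollout(gui, start, actions):
    """Execute a planned action sequence, stopping at the first invalid action."""
    screen, steps = start, []
    for action in actions:
        nxt = gui.get((screen, action))
        if nxt is None:  # action is not available on this screen
            break
        steps.append(Step(screen, action, nxt))
        screen = nxt
    return steps

def reward(steps, planned_len):
    """Toy trajectory score: fraction of the planned actions that executed."""
    return len(steps) / planned_len if planned_len else 0.0

def keep_good(candidates, threshold=1.0):
    """Keep (task, steps) pairs whose score meets the threshold."""
    return [(task, steps) for task, steps, score in candidates if score >= threshold]
```

In this sketch, `explore(TOY_GUI)` yields three transitions, `synthesize_task` turns each into a task description, and `keep_good` drops any trajectory whose planned actions did not all execute. A real pipeline would presumably replace the template and the ratio with learned models.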
Keywords
» Artificial intelligence » Synthetic data