Summary of OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis, by Qiushi Sun et al.

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

by Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu

First submitted to arXiv on: 27 Dec 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary This paper proposes OS-Genesis, a pipeline for collecting high-quality trajectory data to train Graphical User Interface (GUI) agents powered by Vision-Language Models (VLMs). Current methods rely on human supervision, which is resource-intensive, or on synthetic data generation, which cannot guarantee data quality. OS-Genesis reverses the conventional process: agents first perceive their environment and perform step-wise interactions, and high-quality tasks are then derived retrospectively to drive trajectory-level exploration. A trajectory reward model is used to ensure the quality of the generated trajectories. Training GUI agents with OS-Genesis significantly improves their performance on challenging online benchmarks.
Low	GrooveSquid.com (original content)	Low Difficulty Summary This paper helps create better software agents that can operate graphical interfaces the way people do. Right now, it is hard to get good data to train these agents, because we either need people to supervise them or have to generate artificial data. The first option is time-consuming and the second does not always produce good data. The researchers came up with a new way to collect data called OS-Genesis. Instead of following a plan, the agents first explore their environment step by step, and the tasks they carried out are worked out afterward. The resulting data can then be used to train the agents to perform tasks better.
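
The interaction-first idea in the medium summary can be illustrated with a small toy sketch. This is not the authors' implementation: the toy GUI, the names `explore`, `synthesize_task`, `score`, and `build_dataset`, and the threshold-based filtering are all assumptions made for illustration. In the paper's setting the task synthesis and the trajectory reward would come from models, and the synthesized task would be executed again to collect a full trajectory; here both are replaced by simple stand-ins.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Step:
    before: str  # screen before the action
    action: str  # e.g. "click:Settings"
    after: str   # screen after the action


@dataclass
class Trajectory:
    task: str
    steps: List[Step]
    reward: float = 0.0


# Hypothetical toy GUI: (screen, element) -> next screen.
TOY_GUI: Dict[Tuple[str, str], str] = {
    ("home", "Settings"): "settings",
    ("settings", "Wi-Fi"): "wifi",
    ("home", "Mail"): "inbox",
    ("home", "Logo"): "home",  # a click that changes nothing
}


def explore(gui: Dict[Tuple[str, str], str]) -> List[Step]:
    """Interaction-first phase: act without a task and record what happened."""
    return [
        Step(before=screen, action=f"click:{element}", after=nxt)
        for (screen, element), nxt in gui.items()
    ]


def synthesize_task(step: Step) -> str:
    """Stand-in for deriving a task retrospectively from an observed step."""
    element = step.action.split(":", 1)[1]
    return f"Open {element} from the {step.before} screen"


def score(traj: Trajectory) -> float:
    """Stand-in reward: fraction of steps that actually changed the screen."""
    if not traj.steps:
        return 0.0
    changed = sum(1 for s in traj.steps if s.before != s.after)
    return changed / len(traj.steps)


def build_dataset(gui: Dict[Tuple[str, str], str],
                  min_reward: float = 0.5) -> List[Trajectory]:
    """Explore, derive tasks afterward, then keep only well-scored trajectories."""
    dataset = []
    for step in explore(gui):
        # Simplification: the trajectory is just the observed step replayed.
        traj = Trajectory(task=synthesize_task(step), steps=[step])
        traj.reward = score(traj)
        if traj.reward >= min_reward:  # assumed filtering rule
            dataset.append(traj)
    return dataset


if __name__ == "__main__":
    for traj in build_dataset(TOY_GUI):
        print(f"{traj.reward:.1f}  {traj.task}")
```

The point of the sketch is the ordering: interactions are recorded before any task exists, and the task descriptions are written afterward from what was observed. The no-op "Logo" click shows how a quality score can keep uninformative interactions out of the training data.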

Keywords

» Artificial intelligence » Synthetic data
