
Summary of OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis, by Qiushi Sun et al.


OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

by Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu

First submitted to arXiv on: 27 Dec 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes OS-Genesis, a pipeline for collecting high-quality trajectory data to train Graphical User Interface (GUI) agents powered by Vision-Language Models (VLMs). Existing methods rely on human supervision or on synthetic data generated by executing pre-defined tasks, which are either resource-intensive or unable to guarantee data quality. OS-Genesis reverses the conventional process: agents first perceive environments and perform step-wise interactions, and high-quality tasks are then derived retrospectively to enable trajectory-level exploration. A trajectory reward model is used to ensure the quality of the generated trajectories. According to the authors, training GUI agents with OS-Genesis significantly improves their performance on challenging online benchmarks.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper helps create better AI agents that can operate on-screen apps the way humans do. Right now, it’s hard to get good data to train these agents because we need either to have people supervise them or to make synthetic data. This is time-consuming and doesn’t always work well. The researchers came up with a new way to collect data called OS-Genesis. Instead of following a plan, the agents first explore their environment by trying out actions, and the tasks they accomplished are worked out afterwards. Then, the data can be used to train agents to perform tasks better.
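
The "interact first, name the task afterwards" ordering described in the summaries can be illustrated with a small toy sketch. This is not the authors' implementation: the dictionary-based app `TOY_APP`, the template in `derive_task` (standing in for a VLM), the novelty-based `score` (standing in for a trajectory reward model), and the hard `threshold` filter are all assumptions made only for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical toy GUI: each screen maps a clickable element to the screen it opens.
TOY_APP = {
    "home": {"settings_button": "settings", "search_box": "search"},
    "settings": {"wifi_toggle": "wifi", "back": "home"},
    "search": {"back": "home"},
    "wifi": {"back": "settings"},
}


@dataclass
class Step:
    before: str  # screen shown before the action
    action: str  # element that was clicked
    after: str   # screen shown after the action


def explore(start, n_steps, rng):
    """Interaction-first exploration: click random elements with no task in mind."""
    steps, screen = [], start
    for _ in range(n_steps):
        action = rng.choice(sorted(TOY_APP[screen]))
        next_screen = TOY_APP[screen][action]
        steps.append(Step(screen, action, next_screen))
        screen = next_screen
    return steps


def derive_task(steps):
    """Name a task retrospectively from what happened (a template stands in for a VLM)."""
    return f"Starting from '{steps[0].before}', reach '{steps[-1].after}'"


def score(steps):
    """Toy reward: fraction of steps that land on a screen not yet seen in this trajectory."""
    seen = {steps[0].before}
    useful = 0
    for step in steps:
        if step.after not in seen:
            useful += 1
            seen.add(step.after)
    return useful / len(steps)


def synthesize(n_trajectories, n_steps=3, threshold=0.5, seed=0):
    """Explore, derive a task afterwards, and keep only trajectories scoring at least threshold."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_trajectories):
        steps = explore("home", n_steps, rng)
        reward = score(steps)
        if reward >= threshold:
            data.append({"task": derive_task(steps), "steps": steps, "reward": reward})
    return data
```

Calling `synthesize(20)` returns a list of task and trajectory pairs that passed the toy quality check. In a real system the task description and the quality score would come from learned models, and whether low-scoring trajectories are discarded or merely down-weighted is a design choice this sketch does not settle.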

Keywords

» Artificial intelligence  » Synthetic data