Offline Imitation Learning from Multiple Baselines with Applications to Compiler Optimization

by Teodor V. Marinov, Alekh Agarwal, Mircea Trofin

First submitted to arXiv on: 28 Mar 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Programming Languages (cs.PL)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from whichever version suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper addresses a reinforcement learning (RL) problem in which a set of trajectories has been collected from K baseline policies, each with its own strengths in different parts of the state space. The goal is to learn a single policy that competes with the best combination of these baselines over the entire state space. To this end, the authors propose an imitation learning-based algorithm, prove a sample-complexity bound on its accuracy, and show via a matching lower bound that this guarantee is minimax optimal. They then apply the algorithm to machine learning-guided compiler optimization, learning inlining policies that aim to produce small binaries. The results demonstrate that a few iterations of the proposed approach can outperform an initial policy learned via standard RL.
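
To make the idea above concrete, here is a minimal Python sketch of imitation learning from multiple baselines: keep, for each part of the state space, the demonstration from whichever baseline looks best there, then behavior-clone the resulting dataset. This is a toy illustration built on assumptions of our own (a coarse rounded-state key, return-to-go as the per-state quality signal, and a linear softmax policy); it is not the paper's actual algorithm and carries none of its guarantees.

import numpy as np

# Toy offline data: for each of K baseline policies, a list of trajectories,
# where a trajectory is a list of (state, action, return_to_go) triples.
# States are small feature vectors; actions are integers in {0, ..., A-1}.

def select_best_baseline_data(trajectories_per_baseline):
    # For each visited state, keep the (state, action) pair contributed by
    # the baseline with the highest return-to-go there -- a crude stand-in
    # for "the baseline that is strongest in this part of the state space".
    best = {}
    for trajectories in trajectories_per_baseline:
        for trajectory in trajectories:
            for state, action, rtg in trajectory:
                key = tuple(np.round(state, 3))  # coarse state key (our assumption)
                if key not in best or rtg > best[key][0]:
                    best[key] = (rtg, np.asarray(state), action)
    states = np.stack([s for _, s, _ in best.values()])
    actions = np.array([a for _, _, a in best.values()])
    return states, actions

def behavior_clone(states, actions, num_actions, lr=0.1, epochs=200):
    # Fit a linear softmax policy by maximizing the log-likelihood of the
    # selected actions (plain gradient descent on the cross-entropy loss).
    weights = np.zeros((states.shape[1], num_actions))
    onehot = np.eye(num_actions)[actions]
    for _ in range(epochs):
        logits = states @ weights
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        weights -= lr * states.T @ (probs - onehot) / len(states)
    return weights  # act greedily: argmax over states @ weights

In the paper's compiler setting, a state would roughly correspond to the features of an inlining decision point and the action to an inline-or-not choice, with the same clone-the-best-baseline recipe applied on top.
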
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper explores how to make computers better at solving a task on their own by learning from several existing ways of doing it. Imagine you have many different strategies for solving a problem, and each one is good at certain parts of it. The goal is to find a new strategy that combines the best parts of all of them so the whole problem is solved well. The authors developed an algorithm that does this by copying the best parts of each existing strategy. They tested it on compiler optimization, where a compiler decides how to transform a program so that the final executable stays small. The results show that their approach beats the usual method and can keep improving over a few rounds.

Keywords

» Artificial intelligence  » Machine learning  » Optimization  » Reinforcement learning