Summary of C-MORL: Multi-Objective Reinforcement Learning through Efficient Discovery of Pareto Front, by Ruohong Liu et al.

C-MORL: Multi-Objective Reinforcement Learning through Efficient Discovery of Pareto Front

by Ruohong Liu, Yuxin Pan, Linjie Xu, Lei Song, Pengcheng You, Yize Chen, Jiang Bian

First submitted to arXiv on: 3 Oct 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary In this paper, the authors propose a new approach to multi-objective reinforcement learning (MORL) that efficiently discovers the Pareto front, even for unseen preferences. The dominant MORL methods have limitations when handling rapidly changing preferences and when integrating preferences into policy or value functions. To address these issues, the authors introduce Constrained MORL (C-MORL), a two-stage algorithm that combines constrained policy optimization with MORL. In the first stage, C-MORL trains multiple policies in parallel, each optimizing an individual preference over the objectives. In the second stage, constrained optimization steps fill gaps in the Pareto front. The authors evaluate the algorithm on discrete and continuous control tasks and report consistent, superior performance compared with recent MORL methods.
Low	GrooveSquid.com (original content)	Low Difficulty Summary MORL is a way for machines to learn how to make decisions based on multiple goals. Most current MORL methods struggle when preferences change or when those preferences have to be built into the machine’s decision-making process. The authors of this paper propose a new approach to overcome these limitations, which they call Constrained MORL (C-MORL). C-MORL works by training multiple policies at the same time, each one pursuing its own preferred balance of the goals. Then it uses a second step to fill in any gaps in the “Pareto front”, the set of best possible trade-offs between these goals.
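
To make the “Pareto front” and the idea of “gaps” in it concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`dominates`, `pareto_front`, `largest_gap`), the sample return vectors, and the use of Euclidean distance between neighbouring points to locate a gap are all assumptions made for illustration. It assumes every objective is to be maximized.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def largest_gap(front: List[Point]) -> Optional[Tuple[Point, Point]]:
    """Hypothetical gap finder: the pair of neighbours (sorted by first objective)
    that are farthest apart. Returns None if the front has fewer than two points."""
    ordered = sorted(front)
    if len(ordered) < 2:
        return None
    return max(zip(ordered, ordered[1:]), key=lambda pair: math.dist(pair[0], pair[1]))


# Made-up return vectors (objective 1, objective 2) for five policies.
returns = [(1.0, 5.0), (2.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 0.5)]
front = pareto_front(returns)
print(front)               # [(1.0, 5.0), (2.0, 4.0), (4.0, 1.0)]
print(largest_gap(front))  # ((2.0, 4.0), (4.0, 1.0))
```

In this toy example, the policies returning (2.0, 2.0) and (3.0, 0.5) are dominated and drop out. The widest stretch of the front lies between (2.0, 4.0) and (4.0, 1.0), which is the kind of region a gap-filling second stage would aim to populate with new policies.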

Keywords

* Artificial intelligence
* Optimization
* Reinforcement learning
