Summary of Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications, by Zihan Zhou et al.
Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications
by Zihan Zhou, Jonathan Booher, Khashayar Rohanimanesh, Wei Liu, Aleksandr Petiushko, Animesh Garg
First submitted to arXiv on: 23 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary: the paper's original abstract, available on its arXiv page. |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The paper proposes Objective Suppression, a novel approach to curbing dangerous behaviors in long-tail states during safe reinforcement learning. The widely adopted CMDP model constrains risk only in expectation, so hazardous actions can still occur in rare but critical scenarios. To address this, the authors introduce the Uniformly Constrained MDP (UCMDP), which imposes constraints on all reachable states, and pair it with Objective Suppression, which adaptively scales down the task reward according to a safety critic (a hedged sketch of this mechanism follows the table). The method is evaluated in two multi-constraint safety domains, including an autonomous driving scenario where incorrect behavior can be catastrophic. Results show the approach matches baseline task performance while significantly reducing constraint violations. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary: The paper tackles a big challenge in artificial intelligence called safe reinforcement learning. Right now, many AI systems are designed to make good decisions most of the time, but they might still do something bad once in a while. This is a serious issue because sometimes those bad actions have severe consequences. The authors propose a new way to make sure AI systems don't do anything dangerous by mistake. They test this approach in two safety-critical scenarios, including one where self-driving cars need to avoid accidents. The results show that their method works about as well as other approaches while being much safer. |
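The core mechanism described in the medium summary, scaling the task objective down when a learned safety critic predicts constraint violation, can be illustrated with a minimal sketch. This is not the authors' exact formulation: the sigmoid weighting, the `threshold` and `temperature` parameters, and the function names are illustrative assumptions.

```python
import numpy as np

def suppression_weight(safety_cost_estimate, threshold=0.0, temperature=1.0):
    """Weight in (0, 1): near 1 when the safety critic sees little risk,
    near 0 as the predicted constraint cost rises past the threshold.
    (Illustrative form; the paper's exact suppression rule may differ.)"""
    return 1.0 / (1.0 + np.exp((safety_cost_estimate - threshold) / temperature))

def suppressed_reward(task_reward, safety_cost_estimate, **kwargs):
    """Task reward adaptively scaled down by the safety critic's prediction."""
    return suppression_weight(safety_cost_estimate, **kwargs) * task_reward

# Example: a transition the critic judges risky contributes far less
# to the task objective than one it judges safe.
print(suppressed_reward(task_reward=1.0, safety_cost_estimate=-2.0))  # ~0.88 (safe)
print(suppressed_reward(task_reward=1.0, safety_cost_estimate=3.0))   # ~0.05 (risky)
```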
Keywords
- Artificial intelligence
- Reinforcement learning