
Summary of Optimal Transport-assisted Risk-sensitive Q-learning, by Zahra Shahrooei and Ali Baheri


Optimal Transport-Assisted Risk-Sensitive Q-Learning

by Zahra Shahrooei, Ali Baheri

First submitted to arXiv on: 17 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Systems and Control (eess.SY)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

See the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)

The paper proposes a risk-sensitive Q-learning algorithm that leverages optimal transport theory to enhance agent safety. This approach optimizes the policy’s expected return while minimizing the Wasserstein distance between the policy’s stationary distribution and a predefined risk distribution, which encapsulates safety preferences from domain experts. The method is validated in a Gridworld environment, demonstrating faster convergence to a stable policy and fewer visits to risky states compared to traditional Q-learning.

Low Difficulty Summary (written by GrooveSquid.com, original content)

The paper helps make artificial intelligence safer by creating decision-making policies that avoid taking risks. It uses a new algorithm that combines two existing methods: reinforcement learning (which learns to make decisions through trial and error) and optimal transport theory (which measures distances between probability distributions). The goal is to find the best balance between getting good results and staying safe. The researchers tested this approach in a virtual environment called Gridworld and found that it was more effective at avoiding risks and reaching stable solutions than traditional methods.
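
As a rough illustration of the idea in the medium difficulty summary, the Python sketch below adds a Wasserstein penalty to a standard tabular Q-learning update. It is not the authors' algorithm. The function names, the penalty weight beta, the use of running visit counts as a stand-in for the policy's stationary distribution, and the one-dimensional state layout are all assumptions made for this example.

```python
import numpy as np


def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two distributions on states 0..n-1.

    Assumes the states lie on a line with unit spacing, so the distance
    equals the summed absolute difference of the two cumulative sums.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(np.cumsum(p) - np.cumsum(q)).sum())


def ot_shaped_q_update(Q, s, a, r, s_next, visits, pref,
                       alpha=0.1, gamma=0.95, beta=1.0):
    """One tabular Q-learning step with a hypothetical OT penalty.

    visits: running state-visit counts (float array), used here as an
            assumed proxy for the policy's stationary distribution.
    pref:   expert-specified distribution over states (the "risk
            distribution" of the summary).
    beta:   assumed weight trading off return against the penalty.
    """
    visits[s_next] += 1
    empirical = visits / visits.sum()
    penalty = wasserstein_1d(empirical, pref)
    # Standard Q-learning target, with the reward reduced by the penalty.
    target = (r - beta * penalty) + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return penalty


# Usage: three states on a line, two actions, expert prefers state 2.
Q = np.zeros((3, 2))
visits = np.zeros(3)
pref = np.array([0.0, 0.0, 1.0])
penalty = ot_shaped_q_update(Q, s=0, a=1, r=1.0, s_next=2,
                             visits=visits, pref=pref)
```

In this sketch, transitions that pull the visit distribution away from the expert's preferred distribution receive a lower effective reward, which is one simple way to trade expected return against the Wasserstein term. The paper itself may combine the two objectives differently.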

Keywords

  • Artificial intelligence
  • Reinforcement learning