Summary of Harnessing Causality in Reinforcement Learning with Bagged Decision Times, by Daiqi Gao et al.
Harnessing Causality in Reinforcement Learning With Bagged Decision Times
by Daiqi Gao, Hsin-Yu Lai, Predrag Klasnja, Susan A. Murphy
First submitted to arXiv on: 18 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same paper at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to read the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here.
Medium | GrooveSquid.com (original content) | A novel reinforcement learning (RL) approach is developed for problems with bagged decision times, where the actions taken at multiple decision times within a bag jointly affect a single reward observed at the end of the bag. This setting is particularly relevant to mobile health, where a day's activity suggestions collectively affect a user's commitment. To handle non-Markovian transitions within a bag, an expert-provided causal directed acyclic graph (DAG) is used to construct states that form a dynamical Bayesian sufficient statistic. The problem is then formulated as a periodic Markov decision process (MDP), and an online RL algorithm that generalizes Bellman-equation methods for stationary MDPs is proposed. The constructed state achieves the maximal optimal value function among all state constructions for a periodic MDP, and the approach is evaluated on testbed variants built from real data from a mobile health clinical trial. (A toy sketch of this setup follows the table.)
Low | GrooveSquid.com (original content) | Reinforcement learning helps computers make good decisions by trying different actions and seeing how they work out. But what if multiple actions are connected, like daily activities that together affect overall fitness? This paper figures out how to use special graphs to understand these connections and make better decisions. It's like planning a day to get fit: you need to consider all the little things you do, like exercise and diet. The paper shows that this works by testing it on real data from a health study.
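To make the bagged-decision-time setup concrete, here is a minimal toy sketch, not the authors' algorithm: tabular Q-learning on a periodic MDP with period K, where the reward is observed only at the last decision time of each bag. All names and sizes here (`K`, `S`, `A`, `step`, `q_update`) are illustrative assumptions, not anything from the paper.

```python
import numpy as np

# Illustrative sizes (hypothetical): K decision times per bag, S states, A actions.
K, S, A = 5, 10, 2
rng = np.random.default_rng(0)

# One Q-table per position in the bag: a periodic MDP repeats its dynamics with period K.
Q = np.zeros((K, S, A))
alpha, gamma = 0.1, 0.95

def step(k, s, a):
    """Toy stand-in environment: random transitions, with the single bag-level
    reward observed only at the last decision time of the bag (k == K - 1)."""
    s_next = int(rng.integers(S))
    r = float(a + rng.random()) if k == K - 1 else 0.0
    return r, s_next

def q_update(bag):
    """One Bellman backup per transition in a completed bag.
    bag: list of (k, s, a, r, s_next); the periodic index wraps K - 1 -> 0."""
    for k, s, a, r, s_next in bag:
        k_next = (k + 1) % K
        target = r + gamma * Q[k_next, s_next].max()
        Q[k, s, a] += alpha * (target - Q[k, s, a])

for episode in range(500):
    s = int(rng.integers(S))
    bag = []
    for k in range(K):                      # one pass through a bag of K decision times
        greedy = rng.random() > 0.1         # epsilon-greedy exploration
        a = int(Q[k, s].argmax()) if greedy else int(rng.integers(A))
        r, s_next = step(k, s, a)           # r == 0 except at the end of the bag
        bag.append((k, s, a, r, s_next))
        s = s_next
    q_update(bag)                           # backup after the bag-level reward arrives
```

The per-position Q-tables mirror the periodic-MDP idea: because rewards within a bag are zero until the final decision time, repeated backups propagate the bag-level reward to the earlier decision times in the bag.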
Keywords
* Artificial intelligence
* Reinforcement learning