Summary of Control in Stochastic Environment with Delays: a Model-based Reinforcement Learning Approach, by Zhiyuan Yao et al.
Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach
by Zhiyuan Yao, Ionut Florescu, Chihoon Lee
First submitted to arXiv on: 1 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Systems and Control (eess.SY)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | The paper's original abstract (available on arXiv)
Medium | GrooveSquid.com (original content) | This paper presents a novel reinforcement learning approach for control problems involving delayed feedback. Unlike previous deterministic approaches, the method leverages stochastic planning, which makes it possible to incorporate risk preference into policy optimization. The authors show that this formulation recovers the optimal policy for problems with deterministic transitions, and they compare it with two prior methods from the literature. The methodology is first applied to simple tasks to illustrate its features, and its performance is then evaluated on several Atari games.
Low | GrooveSquid.com (original content) | This paper introduces a new way for computers to learn and make decisions in situations where it takes time to get feedback. The approach uses "stochastic planning," which means the computer considers different possible outcomes and their probabilities. This allows the computer to make decisions based on how much risk is involved. The authors show that their method can still find the best solution when outcomes are fully predictable (deterministic). They compare their method with two other methods from previous research and test it by controlling both simple games and more complex Atari games.
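To make the idea of stochastic planning under delayed feedback concrete, here is a minimal sketch (not the paper's actual algorithm): the planner first rolls a stochastic model of the environment through the actions already "in flight" (the delay buffer), then scores each candidate action by a risk-adjusted return (mean minus a risk weight times standard deviation). All function names and the toy 1-D dynamics below are illustrative assumptions.

```python
import random
import statistics

# Toy stochastic 1-D model: the state drifts by the action plus noise.
# Hypothetical sketch of stochastic planning under an action delay,
# not the paper's implementation.

def sample_next_state(state, action, rng):
    """One stochastic transition: drift by the action plus Gaussian noise."""
    return state + action + rng.gauss(0.0, 0.5)

def reward(state):
    """Reward is highest when the state is near the origin."""
    return -abs(state)

def risk_adjusted_plan(state, pending_actions, candidates,
                       n_rollouts=100, risk_lambda=1.0, seed=0):
    """Pick the candidate with the best mean - lambda*std return,
    after first simulating the delayed actions still in the buffer."""
    rng = random.Random(seed)
    best_action, best_score = None, float("-inf")
    for a in candidates:
        returns = []
        for _ in range(n_rollouts):
            s = state
            # Roll the stochastic model through the queued (delayed) actions.
            for queued in pending_actions:
                s = sample_next_state(s, queued, rng)
            # Then apply the candidate action being evaluated.
            s = sample_next_state(s, a, rng)
            returns.append(reward(s))
        mean = statistics.mean(returns)
        std = statistics.stdev(returns)
        score = mean - risk_lambda * std  # risk preference enters here
        if score > best_score:
            best_action, best_score = a, score
    return best_action

# Usage: the state is drifting right and two positive actions are already
# queued, so the planner should favor a corrective (negative) action.
action = risk_adjusted_plan(state=1.0, pending_actions=[0.5, 0.5],
                            candidates=[-2.0, -1.0, 0.0, 1.0])
print(action)
```

Because the rollouts are stochastic, the risk term `risk_lambda * std` penalizes candidates with high outcome variance; setting `risk_lambda = 0` recovers plain expected-return planning, which is one simple way to see how risk preference can be folded into policy optimization.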
Keywords
- Artificial intelligence
- Optimization
- Reinforcement learning