Summary of Effective Exploration Based on the Structural Information Principles, by Xianghua Zeng et al.
Effective Exploration Based on the Structural Information Principles by Xianghua Zeng, Hao Peng, Angsheng Li. First submitted…
Q-WSL: Optimizing Goal-Conditioned RL with Weighted Supervised Learning via Dynamic Programming by Xing Lei, Xuetao Zhang,…
Flipping-based Policy for Chance-Constrained Markov Decision Processes by Xun Shen, Shuo Jiang, Akifumi Wachi, Kazumune Hashimoto,…
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack by Leo McKee-Reid, Christoph…
Quality Diversity Imitation Learning by Zhenglin Wan, Xingrui Yu, David Mark Bossens, Yueming Lyu, Qing Guo,…
Solving robust MDPs as a sequence of static RL problems by Adil Zouitine, Matthieu Geist, Emmanuel…
RL, but don’t do anything I wouldn’t do by Michael K. Cohen, Marcus Hutter, Yoshua Bengio,…
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning by Claire Chen, Shuze Liu, Shangtong Zhang. First…
Reinforcement Learning From Imperfect Corrective Actions And Proxy Rewards by Zhaohui Jiang, Xuening Feng, Paul Weng,…
LLMs Are In-Context Bandit Reinforcement Learners by Giovanni Monea, Antoine Bosselut, Kianté Brantley, Yoav Artzi. First submitted…