Reinforcement learning – Page 70

July 13, 2025

Summary of Adaptive Segment-level Reward: Bridging the Gap Between Action and Reward Space in Alignment, by Yanshi Li et al.

Adaptive Segment-level Reward: Bridging the Gap Between Action and Reward Space in Alignment by Yanshi Li,…

July 13, 2025

Summary of CycleResearcher: Improving Automated Research via Automated Review, by Yixuan Weng et al.

CycleResearcher: Improving Automated Research via Automated Review by Yixuan Weng, Minjun Zhu, Guangsheng Bao, Hongbo Zhang,…

July 13, 2025

Summary of Beyond the Boundaries of Proximal Policy Optimization, by Charlie B. Tan et al.

Beyond the Boundaries of Proximal Policy Optimization by Charlie B. Tan, Edan Toledo, Benjamin Ellis, Jakob…

July 13, 2025

Summary of Token-level Proximal Policy Optimization for Query Generation, by Yichen Ouyang et al.

Token-level Proximal Policy Optimization for Query Generation by Yichen Ouyang, Lu Wang, Fangkai Yang, Pu Zhao,…

July 13, 2025

Summary of Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayesian Theory, by Zhi Zhang et al.

Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayesian Theory by Zhi Zhang, Chris Chow, Yasi Zhang,…

July 13, 2025

Summary of Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions, by Rui Yang et al.

Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions by Rui Yang, Jie…

July 13, 2025

Summary of StepCountJITAI: Simulation Environment for RL with Application to Physical Activity Adaptive Intervention, by Karine Karine and Benjamin M. Marlin

StepCountJITAI: simulation environment for RL with application to physical activity adaptive intervention by Karine Karine, Benjamin…

July 13, 2025

Summary of Hierarchical Preference Optimization: Learning to Achieve Goals via Feasible Subgoals Prediction, by Utsav Singh et al.

Hierarchical Preference Optimization: Learning to achieve goals via feasible subgoals prediction by Utsav Singh, Souradip Chakraborty,…

July 13, 2025

Summary of Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning, by Beyazit Yalcinkaya et al.

Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning by Beyazit Yalcinkaya, Niklas Lauffer, Marcell Vazquez-Chanlatte, Sanjit A.…

July 13, 2025

Summary of EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization, by Mujin Cheon, Jay H. Lee, Dong-Yeun Koh, and Calvin Tsay

EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization by Mujin Cheon, Jay H. Lee, Dong-Yeun…