Summary of Do Transformer World Models Give Better Policy Gradients?, by Michel Ma et al.
Do Transformer World Models Give Better Policy Gradients? by Michel Ma, Tianwei Ni, Clement Gehring, Pierluca…
Interactive Symbolic Regression through Offline Reinforcement Learning: A Co-Design Framework by Yuan Tian, Wenqi Zhou, Michele…
Three Pathways to Neurosymbolic Reinforcement Learning with Interpretable Model and Policy Networks by Peter Graf, Patrick…
Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving by Wensheng…
FlowPG: Action-constrained Policy Gradient with Normalizing Flows by Janaka Chathuranga Brahmanage, Jiajing Ling, Akshat Kumar. First submitted…
Learning mirror maps in policy mirror descent by Carlo Alfano, Sebastian Towers, Silvia Sapora, Chris Lu,…
Context in Public Health for Underserved Communities: A Bayesian Approach to Online Restless Bandits by Biyonka…
Explaining Learned Reward Functions with Counterfactual Trajectories by Jan Wehner, Frans Oliehoek, Luciano Cavalcante Siebert. First submitted…
Learning by Doing: An Online Causal Reinforcement Learning Framework with Causal-Aware Policy by Ruichu Cai, Siyang…
Code as Reward: Empowering Reinforcement Learning with VLMs by David Venuto, Sami Nur Islam, Martin Klissarov,…