Summary of Towards Global Optimality For Practical Average Reward Reinforcement Learning Without Mixing Time Oracles, by Bhrij Patel et al.
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles, by Bhrij Patel,…