Summary of Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback, by Zhirui Chen and Vincent Y. F. Tan