Summary of OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework, by Jian Hu et al.
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework, by Jian Hu, Xibin Wu, Zilin Zhu, Xianyu,…
Towards Modular LLMs by Building and Reusing a Library of LoRAs, by Oleksiy Ostapenko, Zhan Su,…
Graph Feedback Bandits with Similar Arms, by Han Qi, Guo Fei, Li Zhu. First submitted to arXiv…
Accelerating Multilevel Markov Chain Monte Carlo Using Machine Learning Models, by Sohail Reddy, Hillary Fairbanks. First submitted…
Wind Power Prediction across Different Locations using Deep Domain Adaptive Learning, by Md Saiful Islam Sajol,…
Trustworthy Actionable Perturbations, by Jesse Friedbaum, Sudarshan Adiga, Ravi Tandon. First submitted to arXiv on: 18 May…
Learning from Imperfect Human Feedback: a Tale from Corruption-Robust Dueling, by Yuwei Cheng, Fan Yao, Xuefeng…
Towards Robust Policy: Enhancing Offline Reinforcement Learning with Adversarial Attacks and Defenses, by Thanh Nguyen, Tung…
The Power of Active Multi-Task Learning in Reinforcement Learning from Human Feedback, by Ruitao Chen, Liwei…
OTLP: Output Thresholding Using Mixed Integer Linear Programming, by Baran Koseoglu, Luca Traverso, Mohammed Topiwalla, Egor…