Summary of Calibrating Language Models with Adaptive Temperature Scaling, by Johnathan Xie et al.
Calibrating Language Models with Adaptive Temperature Scaling, by Johnathan Xie, Annie S. Chen, Yoonho Lee, Eric…
The Crucial Role of Samplers in Online Direct Preference Optimization, by Ruizhe Shi, Runlong Zhou, Simon…
HybridFlow: A Flexible and Efficient RLHF Framework, by Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu,…
VickreyFeedback: Cost-efficient Data Construction for Reinforcement Learning from Human Feedback, by Guoxi Zhang, Jiuding Duan. First submitted…
Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference, by Qining Zhang, Lei…
Reward-Robust RLHF in LLMs, by Yuzi Yan, Xingzhou Lou, Jialian Li, Yiping Zhang, Jian Xie, Chao…
RLHFuse: Efficient RLHF Training for Large Language Models with Inter- and Intra-Stage Fusion, by Yinmin Zhong,…
From Lists to Emojis: How Format Bias Affects Model Alignment, by Xuanchang Zhang, Wei Xiong, Lichang…
ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood, by Ruoyu Wang, Jiachen Sun, Shaowei Hua, Quan Fang. First…
Quantile Regression for Distributional Reward Models in RLHF, by Nicolai Dorka. First submitted to arXiv on: 16…