Summary of Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model, by Yuzhong Hong et al.
Energy-Based Preference Model Offers Better Offline Alignment than the Bradley-Terry Preference Model, by Yuzhong Hong, Hanshan…