Summary of Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization, by Audrey Huang et al.
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization by Audrey Huang,…