Summary of Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards, by Haoxiang Wang et al.
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards, by Haoxiang…