Summary of "Would I Lie To You? Inference Time Alignment of Language Models using Direct Preference Heads", by Avelina Asada Hadji-Kyriacou and Ognjen Arandjelovic