Summary of Creativity Has Left the Chat: The Price of Debiasing Language Models, by Behnam Mohammadi
Creativity Has Left the Chat: The Price of Debiasing Language Models by Behnam Mohammadi. First submitted to…
Optimizing Autonomous Driving for Safety: A Human-Centric Approach with LLM-Enhanced RLHF by Yuan Sun, Navid Salami…
Direct Alignment of Language Models via Quality-Aware Self-Refinement by Runsheng Yu, Yong Wang, Xiaoqi Jiao, Youzhi…
InstructionCP: A fast approach to transfer Large Language Models into target language by Kuang-Ming Chen, Hung-yi…
Hybrid Preference Optimization: Augmenting Direct Preference Optimization with Auxiliary Objectives by Anirudhan Badrinath, Prabhat Agarwal, Jiajing…
Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT…
360Zhinao Technical Report by 360Zhinao Team. First submitted to arXiv on: 22 May 2024. Categories: Main: Computation and Language…
Leveraging Human Revisions for Improving Text-to-Layout Models by Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li. First…
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness by Aaron J. Li,…
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering by Janak Kapuriya, Chhavi Kirtani, Apoorv Singh, Jay…