Summary of Decoupled Alignment For Robust Plug-and-play Adaptation, by Haozheng Luo et al.
Decoupled Alignment for Robust Plug-and-Play Adaptation by Haozheng Luo, Jiahao Yu, Wenxin Zhang, Jialong Li, Jerry…
InstructionCP: A fast approach to transfer Large Language Models into target language by Kuang-Ming Chen, Hung-yi…
Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT…
Exploring the LLM Journey from Cognition to Expression with Linear Representations by Yuzi Yan, Jialian Li,…
360Zhinao Technical Report by 360Zhinao Team. First submitted to arXiv on: 22 May 2024. Categories: Main: Computation and Language…
Leveraging Human Revisions for Improving Text-to-Layout Models by Amber Xie, Chin-Yi Cheng, Forrest Huang, Yang Li. First…
More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness by Aaron J. Li,…
Mapping Social Choice Theory to RLHF by Jessica Dai, Eve Fleisig. First submitted to arXiv on: 19…
MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering by Janak Kapuriya, Chhavi Kirtani, Apoorv Singh, Jay…
InternLM2 Technical Report by Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen,…