Summary of Mapping Social Choice Theory to RLHF, by Jessica Dai et al.
Mapping Social Choice Theory to RLHF
by Jessica Dai, Eve Fleisig
First submitted to arXiv on: 19…