Summary of Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models, by Francisco Eiras et al.
Do as I do (Safely): Mitigating Task-Specific Fine-tuning Risks in Large Language Models, by Francisco Eiras,…