Summary of Unintended Impacts of LLM Alignment on Global Representation, by Michael J. Ryan et al.
Unintended Impacts of LLM Alignment on Global Representation by Michael J. Ryan, William Held, Diyi Yang First…
Don’t Just Say “I don’t know”! Self-aligning Large Language Models for Responding to Unknown Questions…
COMPASS: Computational Mapping of Patient-Therapist Alliance Strategies with Language Modeling by Baihan Lin, Djallel Bouneffouf, Yulia…
Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs by Arash…
T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching by Zizheng Pan, Bohan Zhuang, De-An…
Inductive Graph Alignment Prompt: Bridging the Gap between Graph Pre-training and Inductive Fine-tuning From Spectral…
Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive by Arka Pal, Deep Karkhanis, Samuel Dooley,…
Modality-Aware Integration with Large Language Models for Knowledge-based Visual Question Answering by Junnan Dong, Qinggang Zhang,…
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! by Zhanhui Zhou, Jie Liu, Zhichen…
Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training by Leo Hyun…