Summary of LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits, by Duy Nguyen et al.
LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits by Duy Nguyen, Archiki Prasad, Elias…
MIO: A Foundation Model on Multimodal Tokens by Zekun Wang, King Zhu, Chunpu Xu, Wangchunshu Zhou,…
Unlocking Memorization in Large Language Models with Dynamic Soft Prompting by Zhepeng Wang, Runxue Bao, Yawen…
ASFT: Aligned Supervised Fine-Tuning through Absolute Likelihood by Ruoyu Wang, Jiachen Sun, Shaowei Hua, Quan Fang. First…
On the Generalizability of Foundation Models for Crop Type Mapping by Yi-Chia Chang, Adam J. Stewart,…
Transformer with Controlled Attention for Synchronous Motion Captioning by Karim Radouane, Sylvie Ranwez, Julien Lagarde, Andon…
Adversarial Attacks on Data Attribution by Xinhe Wang, Pingbang Hu, Junwei Deng, Jiaqi W. Ma. First submitted…
LLMs Will Always Hallucinate, and We Need to Live With This by Sourav Banerjee, Ayushi Agarwal,…
LoCa: Logit Calibration for Knowledge Distillation by Runming Yang, Taiqiang Wu, Yujiu Yang. First submitted to arXiv…
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling by Kaiwen Zheng,…