Summary of Large Language Models As Markov Chains, by Oussama Zekri et al.
Large Language Models as Markov Chains, by Oussama Zekri, Ambroise Odonnat, Abdelhakim Benechehab, Linus Bleistein, Nicolas…
Grounding Large Language Models In Embodied Environment With Imperfect World Models, by Haolan Liu, Jishen Zhao. First…
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model, by Duy M. H. Nguyen, Nghiem T.…
How to Train Long-Context Language Models (Effectively), by Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen. First…
CodeJudge: Evaluating Code Generation with Large Language Models, by Weixi Tong, Tianyi Zhang. First submitted to arxiv…
Automated Red Teaming with GOAT: the Generative Offensive Agent Tester, by Maya Pavlova, Erik Brinkman, Krithika…
HelpSteer2-Preference: Complementing Ratings with Preferences, by Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen,…
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models, by Can Demircan, Tankred Saanum, Akshay…
Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference, by Ke Yi, Zengke Liu, Jianwei…
Scaling Optimal LR Across Token Horizons, by Johan Bjorck, Alon Benhaim, Vishrav Chaudhary, Furu Wei, Xia…