Summary of In-context Learning and Occam’s Razor, by Eric Elmoznino et al.
In-context learning and Occam’s razor, by Eric Elmoznino, Tom Marty, Tejas Kasetty, Leo Gagnon, Sarthak Mittal,…
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens, by Lijie Fan, Tianhong Li, Siyang Qin,…
Enhancing Generalization in Sparse Mixture of Experts Models: The Case for Increased Expert Activation in…
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs, by Tianyu Guo, Druv Pai, Yu Bai,…
Boosting Imperceptibility of Stable Diffusion-based Adversarial Examples Generation with Momentum, by Nashrah Haque, Xiang Li, Zhehui…
In-context KV-Cache Eviction for LLMs via Attention-Gate, by Zihao Zeng, Bokai Lin, Tianqi Hou, Hao Zhang,…
Self-Supervised Learning of Disentangled Representations for Multivariate Time-Series, by Ching Chang, Chiao-Tung Chan, Wei-Yao Wang, Wen-Chih…
MoH: Multi-Head Attention as Mixture-of-Head Attention, by Peng Jin, Bo Zhu, Li Yuan, Shuicheng Yan. First submitted…
The Fair Language Model Paradox, by Andrea Pinto, Tomer Galanti, Randall Balestriero. First submitted to arxiv on:…
DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure, by Yunfan Xiong, Ruoyu Zhang, Yanzeng Li,…