Summary of Benign Overfitting in Single-Head Attention, by Roey Magen et al.
Benign Overfitting in Single-Head Attention by Roey Magen, Shuning Shang, Zhiwei Xu, Spencer Frei, Wei Hu,…
Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling by Yingfa Chen, Xinrong Zhang,…
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning by Xiyao Wang,…
Benign Overfitting for Regression with Trained Two-Layer ReLU Networks by Junhyung Park, Patrick Bloebaum, Shiva Prasad…
QT-DoG: Quantization-aware Training for Domain Generalization by Saqib Javed, Hieu Le, Mathieu Salzmann. First submitted to arXiv…
Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion by Enrico Ventura, Beatrice…
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe by Yuxin Xiao, Shujian Zhang, Wenxuan Zhou,…
Granular Ball Twin Support Vector Machine by A. Quadir, M. Sajid, M. Tanveer. First submitted to arXiv…
Dynamic Post-Hoc Neural Ensemblers by Sebastian Pineda Arango, Maciej Janowski, Lennart Purucker, Arber Zela, Frank Hutter,…
Collaborative and Efficient Personalization with Mixtures of Adaptors by Abdulla Jasem Almansoori, Samuel Horváth, Martin Takáč. First…