Summary of UnitNorm: Rethinking Normalization for Transformers in Time Series, by Nan Huang et al.
UnitNorm: Rethinking Normalization for Transformers in Time Series, by Nan Huang, Christian Kümmerle, Xiang Zhang. First submitted…
Infinite Limits of Multi-head Transformer Dynamics, by Blake Bordelon, Hamza Tahir Chaudhry, Cengiz Pehlevan. First submitted to…
Models That Prove Their Own Correctness, by Noga Amit, Shafi Goldwasser, Orr Paradise, Guy Rothblum. First submitted…
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence, by Abhinav Patil, …
Sequence Length Scaling in Vision Transformers for Scientific Images on Frontier, by Aristeidis Tsaris, Chengming Zhang, …
MLPs Learn In-Context on Regression and Classification Tasks, by William L. Tong, Cengiz Pehlevan. First submitted to…
Spectraformer: A Unified Random Feature Framework for Transformer, by Duke Nguyen, Aditya Joshi, Flora Salim. First submitted…
iVideoGPT: Interactive VideoGPTs are Scalable World Models, by Jialong Wu, Shaofeng Yin, Ningya Feng, Xu He, …
The Buffer Mechanism for Multi-Step Information Reasoning in Language Models, by Zhiwei Wang, Yunji Wang, Zhongwang…
Towards Better Understanding of In-Context Learning Ability from In-Context Uncertainty Quantification, by Shang Liu, Zhongze Cai, …