Softmax – Page 12 – GrooveSquid.com

July 13, 2025

MEP: Multiple Kernel Learning Enhancing Relative Positional Encoding Length Extrapolation, by Weiguo Gao. First submitted to arXiv…

July 13, 2025

Improving Sampling Methods for Fine-tuning SentenceBERT in Text Streams, by Cristiano Mesquita Garcia, Alessandro Lameiras Koerich,…

July 13, 2025

Logits of API-Protected LLMs Leak Proprietary Information, by Matthew Finlayson, Xiang Ren, Swabha Swayamdipta. First submitted to…

July 13, 2025

Uncertainty Quantification for cross-subject Motor Imagery classification, by Prithviraj Manivannan, Ivo Pascal de Jong, Matias Valdenegro-Toro,…

July 13, 2025

Implicit Regularization of Gradient Flow on One-Layer Softmax Attention, by Heejune Sheen, Siyu Chen, Tianhao Wang,…

July 13, 2025

On the Origins of Linear Representations in Large Language Models, by Yibo Jiang, Goutham Rajendran, Pradeep…

July 13, 2025

TaylorShift: Shifting the Complexity of Self-Attention from Squared to Linear (and Back) using Taylor-Softmax, by Tobias…

July 13, 2025

A Theoretical Analysis of Self-Supervised Learning for Vision Transformers, by Yu Huang, Zixin Wen, Yuejie Chi,…

July 13, 2025

Indirectly Parameterized Concrete Autoencoders, by Alfred Nilsson, Klas Wijk, Sai bharath chandra Gutha, Erik Englesson, Alexandra…

July 13, 2025

Training Dynamics of Multi-Head Softmax Attention for In-Context Learning: Emergence, Convergence, and Optimality, by Siyu Chen,…