Summary of ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition, by Lu Ye et al.
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition by Lu Ye, Ze Tao, Yong…
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers by M. Emrullah Ildiz, Yixiao…
Revitalizing Multivariate Time Series Forecasting: Learnable Decomposition with Inter-Series Dependencies and Intra-Series Variations Modeling by Guoqi…
An end-to-end attention-based approach for learning on graphs by David Buterez, Jon Paul Janet, Dino Oglic,…
Transformers, parallel computation, and logarithmic depth by Clayton Sanford, Daniel Hsu, Matus Telgarsky. First submitted to arXiv…
Investigating Out-of-Distribution Generalization of GNNs: An Architecture Perspective by Kai Guo, Hongzhi Wen, Wei Jin, Yaming…
The I/O Complexity of Attention, or How Optimal is Flash Attention? by Barna Saha, Christopher Ye. First…
Mesoscale Traffic Forecasting for Real-Time Bottleneck and Shockwave Prediction by Raphael Chekroun, Han Wang, Jonathan Lee,…
Implicit Bias and Fast Convergence Rates for Self-attention by Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis. First submitted…
Examining Modality Incongruity in Multimodal Federated Learning for Medical Vision and Language-based Disease Detection by Pramit…