Summary of State Space Model for New-Generation Network Alternative to Transformers: A Survey, by Xiao Wang et al.
State Space Model for New-Generation Network Alternative to Transformers: A Survey by Xiao Wang, Shiao Wang, …
BERT-LSH: Reducing Absolute Compute For Attention by Zezheng Li, Kingston Yip. First submitted to arXiv on: 12…
Inheritune: Training Smaller Yet More Attentive Language Models by Sunny Sanyal, Ravid Shwartz-Ziv, Alexandros G. Dimakis, …
LLoCO: Learning Long Contexts Offline by Sijun Tan, Xiuyu Li, Shishir Patil, Ziyang Wu, Tianjun Zhang, …
Localizing Moments of Actions in Untrimmed Videos of Infants with Autism Spectrum Disorder by Halil Ismail…
SqueezeAttention: 2D Management of KV-Cache in LLM Inference via Layer-wise Optimal Budget by Zihao Wang, Bin…
Enhancing IoT Intelligence: A Transformer-based Reinforcement Learning Methodology by Gaith Rjoub, Saidul Islam, Jamal Bentahar, Mohammed…
Graph Neural Networks for Electric and Hydraulic Data Fusion to Enhance Short-term Forecasting of Pumped-storage…
On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers by Cai Zhou, …
Optimizing the Deployment of Tiny Transformers on Low-Power MCUs by Victor J.B. Jung, Alessio Burrello, Moritz…