Summary of Enhanced Structured State Space Models via Grouped FIR Filtering and Attention Sink Mechanisms, by Tian Meng et al.
Enhanced Structured State Space Models via Grouped FIR Filtering and Attention Sink Mechanisms by Tian Meng,…
MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction by Seongju Lee, Junseok Lee, Yeonguk Yu,…
An Explainable Vision Transformer with Transfer Learning Combined with Support Vector Machine Based Efficient Drought…
Contrastive Factor Analysis by Zhibin Duan, Tiansheng Wen, Yifei Wang, Chen Zhu, Bo Chen, Mingyuan Zhou First…
Evaluating Long Range Dependency Handling in Code Generation Models using Multi-Step Key Retrieval by Yannick Assogba,…
Palu: Compressing KV-Cache with Low-Rank Projection by Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin, Chong-Yan Chen, Yu-Fang…
Interpretable Pre-Trained Transformers for Heart Time-Series Data by Harry J. Davies, James Monsen, Danilo P. Mandic First…
A2SF: Accumulative Attention Scoring with Forgetting Factor for Token Pruning in Transformer Decoder by Hyun-rae Jo,…
MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity by Kanghyun Choi, Hye…
Multiscale Representation Enhanced Temporal Flow Fusion Model for Long-Term Workload Forecasting by Shiyu Wang, Zhixuan Chu,…