Summary of Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference, by Jiaming Tang et al.
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference by Jiaming Tang, Yilong Zhao, Kan Zhu, Guangxuan…
L4GM: Large 4D Gaussian Reconstruction Model by Jiawei Ren, Kevin Xie, Ashkan Mirzaei, Hanxue Liang, Xiaohui…
Large Language Models as Interpolated and Extrapolated Event Predictors by Libo Zhang, Yue Ning. First submitted to…
ECGMamba: Towards Efficient ECG Classification with BiSSM by Yupeng Qiang, Xunde Dong, Xiuling Liu, Yang Yang,…
Inverse Probability of Treatment Weighting with Deep Sequence Models Enables Accurate Treatment Effect Estimation from…
Self-attention-based non-linear basis transformations for compact latent space modelling of dynamic optical fibre transmission matrices by…
Non-autoregressive Personalized Bundle Generation by Wenchuan Yang, Cheng Yang, Jichao Li, Yuejin Tan, Xin Lu, Chuan…
CMamba: Channel Correlation Enhanced State Space Models for Multivariate Time Series Forecasting by Chaolv Zeng, Zhanyu…
Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs by Shentong Mo. First submitted to arxiv…
Faithful and Accurate Self-Attention Attribution for Message Passing Neural Networks via the Computation Tree Viewpoint by…