Summary of Your Transformer Is Secretly Linear, by Anton Razzhigaev et al.
Your Transformer is Secretly Linear, by Anton Razzhigaev, Matvey Mikhalchuk, Elizaveta Goncharova, Nikolai Gerasimenko, Ivan Oseledets, …