Summary of Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix, by Yingyu Liang et al.
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix, by Yingyu Liang, Jiangxuan Long, Zhenmei…