Summary of Linear Attention Sequence Parallelism, by Weigao Sun et al.
Linear Attention Sequence Parallelism by Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran…
Cross-Architecture Transfer Learning for Linear-Cost Inference Transformers by Sehyun Choi. First submitted to arxiv on: 3 Apr…
Enhancing Diffusion-based Point Cloud Generation with Smoothness Constraint by Yukun Li, Liping Liu. First submitted to arxiv…
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks by Xingwu Chen,…
Position-Aware Parameter Efficient Fine-Tuning Approach for Reducing Positional Bias in LLMs by Zheng Zhang, Fan Yang,…
Transfer Learning with Point Transformers by Kartik Gupta, Rahul Vippala, Sahima Srivastava. First submitted to arxiv on:…
On Difficulties of Attention Factorization through Shared Memory by Uladzislau Yorsh, Martin Holeňa, Ondřej Bojar, David…
A Multi-Branched Radial Basis Network Approach to Predicting Complex Chaotic Behaviours by Aarush Sinha. First submitted to…
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs by Saleh Ashkboos, Amirkeivan Mohtashami, Maximilian L. Croci, Bo…
Attention-based Shape-Deformation Networks for Artifact-Free Geometry Reconstruction of Lumbar Spine from MR Images by Linchen Qian,…