Summary of Accelerating Transformer Pre-training with 2:4 Sparsity, by Yuezhou Hu et al.
Accelerating Transformer Pre-training with 2:4 Sparsity, by Yuezhou Hu, Kang Zhao, Weiyu Huang, Jianfei Chen, Jun…
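For context, 2:4 (two-out-of-four) structured sparsity is the pattern accelerated in hardware since NVIDIA's Ampere GPUs: every contiguous group of 4 values keeps at most 2 nonzeros. A minimal magnitude-pruning sketch of that pattern (illustrative only, not the paper's training method; `prune_2_4` is a hypothetical helper name):

```python
import numpy as np

def prune_2_4(w):
    """Zero the 2 smallest-magnitude entries in each group of 4.

    Illustrates the 2:4 sparsity pattern: every contiguous group
    of 4 weights keeps its 2 largest-magnitude values.
    """
    w = np.asarray(w, dtype=float)
    groups = w.reshape(-1, 4).copy()
    # Indices of the 2 smallest |w| within each group of 4
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    np.put_along_axis(groups, drop, 0.0, axis=1)
    return groups.reshape(w.shape)

print(prune_2_4([0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.3, -0.8]))
# → [ 0.9  0.   0.4  0.  -0.7  0.   0.  -0.8]
```

Weights pruned this way can be stored and multiplied with roughly 2x throughput on sparse tensor cores, which is the speedup the paper targets during pre-training.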