Summary of MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models, by Gongfan Fang et al.
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models by Gongfan Fang, Hongxu Yin, Saurav Muralidharan, Greg…
Graph Similarity Regularized Softmax for Semi-Supervised Node Classification by Yiming Yang, Jun Liu, Wei Wan. First submitted…
Embedding Geometries of Contrastive Language-Image Pre-Training by Jason Chuan-Chih Chou, Nahid Alam. First submitted to arXiv on:…
Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers by Siyu Chen, Heejune Sheen,…
Learning large softmax mixtures with warm start EM by Xin Bing, Florentina Bunea, Jonathan Niles-Weed, Marten…
OPAL: Outlier-Preserved Microscaling Quantization Accelerator for Generative Large Language Models by Jahyun Koo, Dahoon Park, Sangwoo…
Low Latency Transformer Inference on FPGAs for Physics Applications with hls4ml by Zhixing Jiang, Dennis Yin,…
Theory, Analysis, and Best Practices for Sigmoid Self-Attention by Jason Ramapuram, Federico Danieli, Eeshan Dhekane, Floris…
Whittle Index Learning Algorithms for Restless Bandits with Constant Stepsizes by Vishesh Mittal, Rahul Meshram, Surya…
A Fresh Take on Stale Embeddings: Improving Dense Retriever Training with Corrector Networks by Nicholas Monath,…