Summary of Length-induced Embedding Collapse in Transformer-based Models, by Yuqi Zhou et al.
Length-Induced Embedding Collapse in Transformer-based Models by Yuqi Zhou, Sunhao Dai, Zhanshuo Cao, Xiao Zhang, Jun…
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration by Dezhan Tu, Danylo…
Kernel Looping: Eliminating Synchronization Boundaries for Peak Inference Performance by David Koeplinger, Darshan Gandhi, Pushkar Nandkar,…
EDT: An Efficient Diffusion Transformer Framework Inspired by Human-like Sketching by Xinwang Chen, Ning Liu, Yichen…
LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior by Hanyu Wang, Saksham Suri, Yixuan Ren,…
Relation-based Counterfactual Data Augmentation and Contrastive Learning for Robustifying Natural Language Inference Models by Heerin Yang,…
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning by Xiangyu Zeng, Kunchang Li, Chenting…
Meaning Typed Prompting: A Technique for Efficient, Reliable Structured Output Generation by Chandra Irugalbandara. First submitted to…
ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference by Xin He, Shunkang Zhang,…
Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation by Suho Kang, Jungyang Park, Joonseo…