Summary of Hybrid Dynamic Pruning: A Pathway to Efficient Transformer Inference, by Ghadeer Jaradat et al.
Hybrid Dynamic Pruning: A Pathway to Efficient Transformer Inference by Ghadeer Jaradat, Mohammed Tolba, Ghada Alsuhli,…
A Depression Detection Method Based on Multi-Modal Feature Fusion Using Cross-Attention by Shengjie Li, Yinhao Xiao. First…
Enhancing the Utility of Privacy-Preserving Cancer Classification using Synthetic Data by Richard Osuala, Daniel M. Lang,…
When can transformers compositionally generalize in-context? by Seijin Kobayashi, Simon Schug, Yassir Akram, Florian Redhardt, Johannes…
Co-Designing Binarized Transformer and Hardware Accelerator for Efficient End-to-End Edge Deployment by Yuhao Ji, Chao Fang,…
A Graph-based Adversarial Imitation Learning Framework for Reliable & Realtime Fleet Scheduling in Urban Air…
MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training by Pinxue Zhao, Hailin Zhang, Fangcheng Fu,…
Understanding Transformers via N-gram Statistics by Timothy Nguyen. First submitted to arXiv on: 30 Jun 2024. Categories. Main: Computation…
Exploring Quantization for Efficient Pre-Training of Transformer Language Models by Kamran Chitsaz, Quentin Fournier, Gonçalo Mordido,…
LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices by Jung Hyun…