Summary of Tiny Models are the Computational Saver for Large Models, by Qingyuan Wang et al.
Tiny Models are the Computational Saver for Large Models, by Qingyuan Wang, Barry Cardiff, Antoine Frappé,…
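The title points at a cascade-style idea: let a tiny model answer the easy inputs and invoke the large model only when needed. Below is a minimal, hypothetical PyTorch sketch of a confidence-gated cascade in that spirit; the Cascade class, the 0.9 threshold, and the toy linear models are illustrative assumptions, not the paper's actual algorithm or API.

```python
# Hypothetical sketch of a confidence-gated model cascade: a tiny model
# answers easy inputs, and the large model is consulted only for samples
# where the tiny model is unsure. All names and values are illustrative.
import torch
import torch.nn as nn

class Cascade(nn.Module):
    def __init__(self, tiny: nn.Module, large: nn.Module, threshold: float = 0.9):
        super().__init__()
        self.tiny = tiny
        self.large = large
        self.threshold = threshold  # confidence needed to accept the tiny model's answer

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.tiny(x)
        probs = torch.softmax(logits, dim=-1)
        conf, _ = probs.max(dim=-1)
        # Defer only the low-confidence samples to the large model.
        hard = conf < self.threshold
        if hard.any():
            logits[hard] = self.large(x[hard])
        return logits

# Toy usage: two linear "models" stand in for a tiny and a large network.
tiny = nn.Linear(16, 10)
large = nn.Linear(16, 10)
cascade = Cascade(tiny, large, threshold=0.9)
out = cascade(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 10])
```

The max-softmax confidence gate used here is just the simplest possible deferral rule; any uncertainty estimate could take its place.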
LLM Inference Unveiled: Survey and Roofline Model Insights, by Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen…
From Cloud to Edge: Rethinking Generative AI for Low-Resource Design Challenges, by Sai Krishna Revanth Vuruma,…
Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy, by Seyedarmin Azizi, Mahdi Nazemi, Massoud Pedram. First submitted…
Model Compression Techniques in Biometrics Applications: A Survey, by Eduarda Caldeira, Pedro C. Neto, Marco Huber,…
Safety and Performance, Why not Both? Bi-Objective Optimized Model Compression toward AI Software Deployment, by Jie…
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs, by Lanxiang Hu, Tajana Rosing, Hao Zhang. First submitted to…
Optimising TinyML with Quantization and Distillation of Transformer and Mamba Models for Indoor Localisation on…
Low-Rank Correction for Quantized LLMs, by Meyer Scetbon, James Hensman. First submitted to arxiv on: 10 Dec…
Lossless Model Compression via Joint Low-Rank Factorization Optimization, by Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin…