Summary of 1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit, by Chang Gao et al.
1-Bit FQT: Pushing the Limit of Fully Quantized Training to 1-bit by Chang Gao, Jianfei Chen,…
MPruner: Optimizing Neural Network Size with CKA-Based Mutual Information Pruning by Seungbeom Hu, ChanJun Park, Andrew…
Exploiting Student Parallelism for Efficient GPU Inference of BERT-like Models in Online Services by Weiyan Wang,…
Pruning By Explaining Revisited: Optimizing Attribution Methods to Prune CNNs and Transformers by Sayed Mohammad Vakilzadeh…
Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation by Nishan Gunawardena, Gough Yumu Lui,…
LLM Pruning and Distillation in Practice: The Minitron Approach by Sharath Turuvekere Sreenivas, Saurav Muralidharan, Raviraj…
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism by Guanchen Li, Xiandong Zhao, Lian Liu,…
Single-cell Curriculum Learning-based Deep Graph Embedding Clustering by Huifa Li, Jie Fu, Xinpeng Ling, Zhiyu Sun,…
LLM-Barber: Block-Aware Rebuilder for Sparsity Mask in One-Shot for Large Language Models by Yupeng Su, Ziyi…
Research on Personalized Compression Algorithm for Pre-trained Models Based on Homomorphic Entropy Increase by Yicong Li,…