Summary of APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models, by Ziyi Guan et al.
APTQ: Attention-aware Post-Training Mixed-Precision Quantization for Large Language Models by Ziyi Guan, Hantao Huang, Yupeng Su,…