Summary of Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers, by Rui Ding et al.
Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers
by Rui Ding, Liang Yong, Sihuan Zhao, Jing Nie, Lihui Chen, Haijun Liu, Xichuan Zhou
First submitted to arXiv on: 19 Dec 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper proposes Progressive Fine-to-Coarse Reconstruction (PFCR), a method for accurate Post-Training Quantization (PTQ) of Vision Transformers (ViTs). Existing PTQ methods often suffer a significant performance drop when models are quantized to low-bit representations, in part because they fix the reconstruction granularity in advance, which leads to sub-optimal results. PFCR instead treats the multi-head self-attention and multi-layer perceptron modules, together with their shortcuts, as the finest reconstruction units, combines them into progressively coarser blocks, and reconstructs iteratively at the different granularities (a minimal sketch follows the table below). The paper also introduces a Progressive Optimization Strategy (POS) for PFCR to ease optimization and further enhance model performance. Experimental results on the ImageNet and COCO datasets demonstrate the effectiveness of the proposed method. |
Low | GrooveSquid.com (original content) | The paper tackles a problem in computer vision: how to compress models without losing accuracy. This matters because it makes it possible to run powerful models like Vision Transformers (ViTs) on devices with limited memory or processing power. The authors propose a new way to compress ViTs, called Progressive Fine-to-Coarse Reconstruction (PFCR). The method breaks the model into smaller parts and then combines and tunes them in a specific order. The authors also introduce an optimization strategy that makes the compressed model easier to tune. The results show that this approach beats other methods for compressing ViTs. |
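
For readers who want a concrete picture of the fine-to-coarse idea from the medium summary, here is a minimal PyTorch sketch of a two-stage reconstruction loop. It is an illustration under stated assumptions, not the authors' implementation: the `msa` and `mlp` attribute names, the MSE reconstruction loss, and the optimizer settings are all hypothetical, and the actual PFCR method additionally combines units with their shortcut connections and pairs the reconstruction with the Progressive Optimization Strategy (POS).

```python
# Hypothetical sketch of fine-to-coarse PTQ reconstruction; module layout
# (blk.msa / blk.mlp), loss, and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def reconstruct(fp_unit, q_unit, calib_inputs, steps=100, lr=1e-4):
    """Tune the quantized unit so its output matches the full-precision unit."""
    opt = torch.optim.Adam(q_unit.parameters(), lr=lr)
    for _ in range(steps):
        for x in calib_inputs:
            with torch.no_grad():
                target = fp_unit(x)  # full-precision reference output
            loss = F.mse_loss(q_unit(x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()

def progressive_fine_to_coarse(fp_blocks, q_blocks, calib_inputs):
    # Stage 1 (fine): reconstruct each MSA and MLP unit separately.
    for fp_blk, q_blk in zip(fp_blocks, q_blocks):
        reconstruct(fp_blk.msa, q_blk.msa, calib_inputs)
        reconstruct(fp_blk.mlp, q_blk.mlp, calib_inputs)
    # Stage 2 (coarse): reconstruct each full transformer block jointly,
    # starting from the fine-grained solution found above.
    for fp_blk, q_blk in zip(fp_blocks, q_blocks):
        reconstruct(fp_blk, q_blk, calib_inputs)
```

The intuition behind this ordering is that the fine stage cheaply gives every small unit a good starting point, while the coarse stage then accounts for interactions between units inside a block, which a fixed single granularity would miss.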
Keywords
» Artificial intelligence » Optimization » Quantization » Self attention