Summary of Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers, by Rui Ding et al.
Progressive Fine-to-Coarse Reconstruction for Accurate Low-Bit Post-Training Quantization in Vision Transformers
by Rui Ding, Liang Yong, Sihuan Zhao, Jing Nie, Lihui Chen, Haijun Liu, Xichuan Zhou
First submitted to arXiv on: 19 Dec 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper proposes Progressive Fine-to-Coarse Reconstruction (PFCR), a method for accurate Post-Training Quantization (PTQ) of Vision Transformers (ViTs). Existing PTQ methods often suffer a significant performance drop when models are quantized to low-bit representations, in part because they fix the reconstruction granularity in advance, which leads to sub-optimal results. PFCR instead treats the multi-head self-attention and multi-layer perceptron modules, together with their shortcuts, as the finest reconstruction units, combines them into progressively coarser blocks, and reconstructs iteratively at the different granularities (a minimal sketch follows the table below). The paper also introduces a Progressive Optimization Strategy (POS) for PFCR to ease optimization and further enhance model performance. Experimental results on the ImageNet and COCO datasets demonstrate the effectiveness of the proposed method. |
Low | GrooveSquid.com (original content) | The paper tackles a problem in computer vision: how to compress models without losing accuracy. This matters because it makes it possible to run powerful models like Vision Transformers (ViTs) on devices with limited memory or processing power. The authors propose a new way to compress ViTs, called Progressive Fine-to-Coarse Reconstruction (PFCR). The method breaks the model into smaller parts and then combines and tunes them in a specific order. The authors also introduce an optimization strategy that makes the compressed model easier to tune. The results show that this approach beats other methods for compressing ViTs. |
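
For readers who want a concrete picture of the fine-to-coarse idea from the medium summary, here is a minimal PyTorch sketch of a two-stage reconstruction loop. It is an illustration under stated assumptions, not the authors' implementation: the `msa` and `mlp` attribute names, the MSE reconstruction loss, and the optimizer settings are all hypothetical, and the actual PFCR method additionally combines units with their shortcut connections and pairs the reconstruction with the Progressive Optimization Strategy (POS).

```python
# Hypothetical sketch of fine-to-coarse PTQ reconstruction; module layout
# (blk.msa / blk.mlp), loss, and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def reconstruct(fp_unit, q_unit, calib_inputs, steps=100, lr=1e-4):
    """Tune the quantized unit so its output matches the full-precision unit."""
    opt = torch.optim.Adam(q_unit.parameters(), lr=lr)
    for _ in range(steps):
        for x in calib_inputs:
            with torch.no_grad():
                target = fp_unit(x)  # full-precision reference output
            loss = F.mse_loss(q_unit(x), target)
            opt.zero_grad()
            loss.backward()
            opt.step()

def progressive_fine_to_coarse(fp_blocks, q_blocks, calib_inputs):
    # Stage 1 (fine): reconstruct each MSA and MLP unit separately.
    for fp_blk, q_blk in zip(fp_blocks, q_blocks):
        reconstruct(fp_blk.msa, q_blk.msa, calib_inputs)
        reconstruct(fp_blk.mlp, q_blk.mlp, calib_inputs)
    # Stage 2 (coarse): reconstruct each full transformer block jointly,
    # starting from the fine-grained solution found above.
    for fp_blk, q_blk in zip(fp_blocks, q_blocks):
        reconstruct(fp_blk, q_blk, calib_inputs)
```

The intuition behind this ordering is that the fine stage cheaply gives every small unit a good starting point, while the coarse stage then accounts for interactions between units inside a block, which a fixed single granularity would miss.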
Keywords
» Artificial intelligence » Optimization » Quantization » Self attention