Summary of Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling, by Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, Zhou Su, Xiaopeng Hong, and Deyu Meng
Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling
by Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, Zhou Su, Xiaopeng Hong, Deyu Meng
First submitted to arXiv on: 23 Feb 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The proposed semi-supervised crowd-counting model predicts a probability distribution over density values at each pixel, rather than the single deterministic value that traditional regressors output. A pixel-wise distribution matching loss measures the difference between the predicted and ground-truth density distributions. The transformer decoder is augmented with density tokens that specialize its forward passes for different density intervals, and an interleaving consistency self-supervised learning mechanism lets the model learn efficiently from unlabeled data. Experiments on four datasets show that the method outperforms competing approaches under various labeled-ratio settings. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Imagine you’re trying to count people in a big crowd, but only some of the pictures have labels. This paper helps by estimating, for every pixel in a picture, how crowded that spot is, and then adding those estimates up. Instead of committing to a single number per pixel, the model predicts a set of likely density values with probabilities, which captures how sure it is. The new model has three parts: a matching loss that compares predicted and true distributions, special decoder tokens for different density levels, and a self-supervised scheme that learns from unlabeled pictures. This method beats others at counting crowds across varying amounts of labeled training data. |
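
To make the idea of per-pixel density distributions more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the summaries above do not specify the exact form of the distribution matching loss, so the cumulative-distribution distance used here is a stand-in chosen for illustration. The bin centers, the tiny 2x2 "image", and all function names (`softmax`, `expected_density`, `cdf_distance`) are hypothetical.

```python
import math

def softmax(logits):
    """Turn one pixel's raw scores into a probability distribution over bins."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_density(probs, bin_centers):
    """Collapse a pixel's distribution into a single density (its mean)."""
    return sum(p * c for p, c in zip(probs, bin_centers))

def cdf_distance(pred, target):
    """Sum of absolute differences between the two cumulative distributions.

    Assumption: this is an illustrative stand-in for a distribution
    matching loss, not the loss defined in the paper. It is measured in
    units of bin index and grows as probability mass sits further from
    the target bin.
    """
    dist, pred_cdf, target_cdf = 0.0, 0.0, 0.0
    for p, t in zip(pred, target):
        pred_cdf += p
        target_cdf += t
        dist += abs(pred_cdf - target_cdf)
    return dist

# Hypothetical density bins (people per pixel) shared by every pixel.
BIN_CENTERS = [0.0, 0.5, 1.0, 2.0]

# Hypothetical decoder outputs for a 2x2 image: one score per bin per pixel.
logit_map = [
    [[2.0, 0.1, -1.0, -2.0], [0.0, 1.5, 0.5, -1.0]],
    [[-1.0, 0.0, 2.0, 0.5], [-2.0, -1.0, 0.5, 2.5]],
]
# Hypothetical ground truth: each labeled pixel falls in exactly one bin.
target_map = [
    [[1, 0, 0, 0], [0, 1, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, 1]],
]

prob_map = [[softmax(px) for px in row] for row in logit_map]
density_map = [[expected_density(p, BIN_CENTERS) for p in row] for row in prob_map]

# The crowd count is the sum of the per-pixel expected densities.
count = sum(sum(row) for row in density_map)

# Average the per-pixel distances to get one loss value for the image.
pairs = [(p, t) for prow, trow in zip(prob_map, target_map) for p, t in zip(prow, trow)]
loss = sum(cdf_distance(p, t) for p, t in pairs) / len(pairs)

print(f"estimated count: {count:.3f}, matching loss: {loss:.3f}")
```

One reason to compare cumulative distributions in a sketch like this is that density bins are ordered: predicting the neighbouring bin is penalized less than predicting a bin far away, which a plain per-bin comparison would not capture.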
Keywords
* Artificial intelligence
* Decoder
* Probability
* Self-supervised
* Semi-supervised
* Transformer