Summary of Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling, by Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, Zhou Su, Xiaopeng Hong, and Deyu Meng


Semi-supervised Counting via Pixel-by-pixel Density Distribution Modelling

by Hui Lin, Zhiheng Ma, Rongrong Ji, Yaowei Wang, Zhou Su, Xiaopeng Hong, Deyu Meng

First submitted to arXiv on: 23 Feb 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
The proposed semi-supervised crowd-counting model regresses each pixel's density value as a probability distribution rather than as a single deterministic value, which is the limitation of traditional approaches. A distribution matching loss measures the difference between the predicted and ground-truth distributions. The transformer decoder is enhanced with density tokens that specialize its forward passes for different density intervals. An interleaving consistency self-supervised learning mechanism lets the model learn efficiently from unlabeled data. Experiments on four datasets show significant improvements over competing methods under various labeled-ratio settings.
Low GrooveSquid.com (original content) Low Difficulty Summary
Imagine you’re trying to count people in a big crowd, but only some of the pictures have labels. This paper helps by estimating, for every pixel, how crowded that spot is. Instead of giving one number per pixel, it gives a set of probabilities over possible density values, which captures uncertainty better than a single number. The new model has three parts: a distribution matching loss that rewards good predictions, special decoder tokens for different density levels, and a self-supervised mechanism that learns from unlabeled pictures. This method beats others at counting crowds across varying amounts of labeled training data.
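
To make the idea of per-pixel distributions concrete, here is a minimal sketch in Python. It is not the paper's loss: it assumes each pixel's density is represented as probabilities over a few ordered bins, and it swaps in a simple cumulative-distribution (1-D Wasserstein-style) distance as a stand-in for a distribution matching loss. The bin centers and the function names cdf_matching_loss and expected_density are hypothetical, chosen only for illustration.

```python
import numpy as np


def cdf_matching_loss(pred_probs, target_probs):
    """Mean per-pixel L1 distance between cumulative distributions.

    Both inputs have shape (H, W, K): K probabilities per pixel over
    ordered density bins. This is an illustrative stand-in, not the
    loss defined in the paper. Bins are treated as equally spaced.
    """
    pred_cdf = np.cumsum(pred_probs, axis=-1)
    target_cdf = np.cumsum(target_probs, axis=-1)
    per_pixel = np.abs(pred_cdf - target_cdf).sum(axis=-1)
    return float(per_pixel.mean())


def expected_density(probs, bin_centers):
    """Collapse each pixel's distribution to a single expected density."""
    return probs @ bin_centers


# Hypothetical representative density value for each of K = 4 bins.
bin_centers = np.array([0.0, 0.5, 1.0, 2.0])

# A 1x1 "image": the ground truth puts all mass in bin 1.
target = np.array([[[0.0, 1.0, 0.0, 0.0]]])
near = np.array([[[0.0, 0.0, 1.0, 0.0]]])  # prediction off by one bin
far = np.array([[[0.0, 0.0, 0.0, 1.0]]])   # prediction off by two bins

print(cdf_matching_loss(target, target))  # 0.0
print(cdf_matching_loss(near, target))    # 1.0
print(cdf_matching_loss(far, target))     # 2.0
print(expected_density(target, bin_centers))
```

The point of the sketch is that a distance over ordered bins penalizes a prediction more the further its probability mass sits from the ground truth, which a single-value regression target cannot express. Summing expected_density over all pixels would give an estimated count under these assumptions.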

Keywords

* Artificial intelligence  * Decoder  * Probability  * Self-supervised  * Semi-supervised  * Transformer