Summary of Theoretical Proportion Label Perturbation For Learning From Label Proportions in Large Bags, by Shunsuke Kubo et al.


Theoretical Proportion Label Perturbation for Learning from Label Proportions in Large Bags

by Shunsuke Kubo, Shinnosuke Matsuo, Daiki Suehiro, Kazuhiro Terada, Hiroaki Ito, Akihiko Yoshizawa, Ryoma Bise

First submitted to arXiv on: 26 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High difficulty (written by the paper authors)
Read the original abstract here

Medium difficulty (written by GrooveSquid.com, original content)
This study addresses Learning from Label Proportions (LLP), a weakly supervised learning problem in which an instance-level classifier is trained from the label proportions of bags, without instance labels. Existing LLP methods are hard to apply when bags are large, because GPU memory limits make it impossible to feed every instance of a bag to the model at once. A straightforward workaround is to generate mini-bags by sampling instances from the original bags and to train on these mini-bags instead. However, the true label proportion of a mini-bag is unknown and generally differs from that of its original bag, so reusing the original proportion as the label means training on noisy label proportions, which leads to overfitting. To mitigate this, the authors perturb the proportion labels of the sampled mini-bags, drawing the perturbation from the multivariate hypergeometric distribution, which statistically models sampling without replacement from a bag. In addition, loss weighting reduces the negative impact of proportions sampled from the tail of the distribution. Experiments show that the proposed method achieves classification accuracy comparable to that obtained without sampling.

Low difficulty (written by GrooveSquid.com, original content)
This study tackles a machine learning problem called Learning from Label Proportions (LLP). In LLP, a computer program learns to classify individual items even though it is never told the label of any single item; it only knows what fraction of each group (called a bag) belongs to each category. When the groups are very large, they do not fit in the memory of the graphics processor used for training, so the authors break them into smaller groups by picking items at random and train the program on those. The catch is that a small random group rarely has exactly the same mix of categories as the big group it came from, so the program can end up memorizing proportions that are slightly wrong. To fix this, the authors slightly change the proportion labels of the small groups in a statistically principled way, so the program does not treat any single label as perfectly accurate. They also give less weight to changed labels that are unlikely to be right. In tests, the new method worked about as well as training on the full groups without sampling.
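
The perturbation idea in the medium difficulty summary can be illustrated with a short Python sketch. It is not the authors' implementation. It assumes the per-class counts of the original bag are known (bag size times label proportion), draws plausible mini-bag class counts from the multivariate hypergeometric distribution, and turns them into proportion labels. The function name `perturbed_proportions`, its parameters, and the weighting rule (probability of each draw divided by the largest probability among the draws) are hypothetical choices made for illustration; the paper's exact loss weighting may differ.

```python
import math

import numpy as np


def perturbed_proportions(bag_counts, mini_bag_size, n_draws, rng):
    """Draw plausible mini-bag label proportions and illustrative loss weights.

    bag_counts: per-class instance counts of the original bag (assumed known).
    mini_bag_size: number of instances sampled without replacement.
    """
    bag_counts = np.asarray(bag_counts, dtype=np.int64)

    # Each row is a vector of per-class counts for one hypothetical mini-bag.
    draws = rng.multivariate_hypergeometric(bag_counts, mini_bag_size, size=n_draws)
    proportions = draws / mini_bag_size

    # Probability of each draw: prod_i C(K_i, k_i) / C(N, n).
    total = math.comb(int(bag_counts.sum()), mini_bag_size)
    pmf = np.array([
        math.prod(math.comb(int(K), int(k)) for K, k in zip(bag_counts, d)) / total
        for d in draws
    ])

    # Illustrative weighting (an assumption): draws from the tail of the
    # distribution receive smaller weights than the most likely draw.
    weights = pmf / pmf.max()
    return proportions, weights
```

For example, calling `perturbed_proportions([600, 300, 100], 50, 8, np.random.default_rng(0))` returns eight candidate proportion vectors for a mini-bag of 50 instances, each summing to 1, together with one weight per vector. In a training loop, each weight would multiply the proportion loss of the mini-bag that uses the corresponding perturbed label.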

Keywords

» Artificial intelligence  » Classification  » Machine learning  » Overfitting  » Supervised