Summary of Structural-Entropy-Based Sample Selection for Efficient and Effective Learning, by Tianchi Xie et al.
Structural-Entropy-Based Sample Selection for Efficient and Effective Learning
by Tianchi Xie, Jiangning Zhu, Guozu Ma, Minzhi Lin, Wei Chen, Weikai Yang, Shixia Liu
First submitted to arXiv on: 3 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
High | Paper authors | The paper's original abstract, available on arXiv |
Medium | GrooveSquid.com (original content) | This paper proposes a novel approach to sample selection in machine learning, aiming to improve model efficiency and effectiveness by choosing informative and representative samples. Samples are modeled as a graph whose nodes are the samples and whose edges encode pairwise similarities. Most existing methods overlook global information, such as connectivity patterns, which can result in suboptimal selection. To address this, the authors employ structural entropy to quantify global information and decompose it from the whole graph down to individual nodes using the Shapley value. They present SES (Structural-Entropy-based Sample Selection), a method that integrates both global and local information to select informative and representative samples. SES constructs a kNN graph from sample similarities, measures each sample's importance by combining structural entropy with training difficulty, and applies importance-biased blue noise sampling to select a diverse and representative subset. |
Low | GrooveSquid.com (original content) | This paper helps us better choose which examples we use to train machine learning models. It’s like taking a good picture of a city: you want the right balance of buildings, roads, and people. Right now, most methods only look at what each example is like individually, without considering how all the examples fit together, which can lead to poor choices. To fix this, the authors use a new way to measure how important each example is based on its relationships with other examples, then use that measure to choose the best examples for training. The results show that their method works well across different learning scenarios. |
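The pipeline described in the medium summary can be sketched in a few lines of NumPy. This is only an illustrative toy, not the paper's implementation: the function names are invented, the per-node degree-based entropy is a simple stand-in for the paper's Shapley-value decomposition of structural entropy, the `entropy * difficulty` combination rule is hypothetical, and a greedy minimum-distance filter stands in for importance-biased blue noise sampling.

```python
import numpy as np

def knn_graph(X, k=3):
    """Symmetric kNN adjacency from pairwise Euclidean distances
    (an illustrative similarity graph, not necessarily the paper's)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-edges
    nbrs = np.argsort(d, axis=1)[:, :k]  # k nearest neighbors per node
    A = np.zeros((len(X), len(X)))
    for i, js in enumerate(nbrs):
        A[i, js] = 1.0
    return np.maximum(A, A.T)            # symmetrize

def node_entropy(A):
    """Per-node share of the one-dimensional structural entropy
    H = -sum_i (d_i / vol) * log2(d_i / vol); a degree-based proxy
    for the paper's Shapley-value decomposition."""
    deg = A.sum(axis=1)
    p = deg / deg.sum()
    return -p * np.log2(p)

def select_samples(X, difficulty, n_select, k=3, radius=0.1):
    """Rank samples by entropy * training difficulty (a hypothetical
    combination rule), then greedily keep candidates that are farther
    than `radius` from every sample already chosen -- a crude
    diversity filter in place of blue noise sampling."""
    A = knn_graph(X, k)
    importance = node_entropy(A) * difficulty
    chosen = []
    for i in np.argsort(-importance):    # most important first
        if all(np.linalg.norm(X[i] - X[j]) > radius for j in chosen):
            chosen.append(int(i))
        if len(chosen) == n_select:
            break
    return chosen
```

For example, `select_samples(X, difficulty, 100)` would return the indices of roughly the 100 most important, mutually well-separated samples; the real method additionally derives `difficulty` from training dynamics and biases the sampling pattern itself rather than applying a hard distance cutoff.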
Keywords
- Artificial intelligence
- Machine learning