
Summary of ELFS: Label-Free Coreset Selection with Proxy Training Dynamics, by Haizhong Zheng et al.


ELFS: Label-Free Coreset Selection with Proxy Training Dynamics

by Haizhong Zheng, Elisa Tsai, Yifu Lu, Jiachen Sun, Brian R. Bartoldson, Bhavya Kailkhura, Atul Prakash

First submitted to arXiv on: 6 Jun 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
The paper introduces ELFS (Effective Label-Free Coreset Selection), a label-free coreset selection method for choosing informative and representative data subsets for human annotation. The authors address two challenges in current state-of-the-art (SOTA) label-free coreset selection: estimating training-dynamics-based data difficulty scores without ground-truth labels, and mitigating bias in the resulting scores. ELFS uses deep clustering to estimate data difficulty scores and proposes a double-end pruning method to reduce that bias. The paper evaluates ELFS on four vision benchmarks and reports consistent improvements over SOTA label-free baselines when using the same vision encoder.
Low GrooveSquid.com (original content) Low Difficulty Summary
ELFS is a new way to pick the most useful parts of a big dataset so that humans only have to label those parts instead of the whole thing. This makes it faster and cheaper to prepare data for machine learning models. The old ways of doing this required a lot of already-annotated data, but ELFS doesn’t need any labels. It groups similar examples together (clustering) to estimate how hard each one is to learn, then uses a pruning step to correct for bias in those estimates. The authors tested ELFS on four vision datasets and found that it consistently worked better than other label-free methods.
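
The two steps named in the medium difficulty summary (difficulty scores computed without ground-truth labels, then double-end pruning) can be illustrated with a small Python sketch. This is not the authors' implementation, and the summaries above do not specify the score or the pruning rule. Everything below is an assumption made for illustration: the pseudo-labels are taken to come from a clustering step that is not shown, the score is simply the fraction of epochs in which a proxy model disagreed with an example's pseudo-label, and the function names and the `hard_fraction` parameter are hypothetical.

```python
from typing import List, Sequence


def difficulty_scores(correct_per_epoch: Sequence[Sequence[bool]]) -> List[float]:
    """Return one difficulty score per example (higher means harder).

    correct_per_epoch[e][i] is True when, at epoch e, a proxy model predicted
    the pseudo-label of example i. The pseudo-labels are assumed to come from
    clustering rather than from human annotation. The score used here, the
    fraction of epochs with a wrong prediction, is an illustrative stand-in
    for a training-dynamics score.
    """
    n_epochs = len(correct_per_epoch)
    n_examples = len(correct_per_epoch[0])
    return [
        sum(1 for e in range(n_epochs) if not correct_per_epoch[e][i]) / n_epochs
        for i in range(n_examples)
    ]


def double_end_prune(scores: Sequence[float], budget: int, hard_fraction: float) -> List[int]:
    """Pick `budget` example indices after trimming both ends of the ranking.

    Hypothetical rule: rank examples from hardest to easiest, drop the hardest
    `hard_fraction` of them, keep the next `budget` examples, and leave the
    remaining easiest examples unselected.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_hard = int(hard_fraction * len(scores))
    return sorted(order[n_hard:n_hard + budget])


# Toy log: 4 epochs (rows) by 5 examples (columns).
logs = [
    [True, False, True, True, False],
    [True, False, True, True, False],
    [True, False, False, True, False],
    [True, False, False, False, True],
]
scores = difficulty_scores(logs)  # [0.0, 1.0, 0.5, 0.25, 0.75]
selected = double_end_prune(scores, budget=3, hard_fraction=0.25)
print(selected)  # [2, 3, 4]
```

In this toy run, example 1 (never predicted correctly) is dropped from the hard end and example 0 (always predicted correctly) is left out at the easy end, so only the mid-difficulty examples would be sent for human annotation.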

Keywords

» Artificial intelligence  » Clustering  » Encoder  » Machine learning  » Pruning