Retraining with Predicted Hard Labels Provably Increases Model Accuracy

by Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong

First submitted to arXiv on: 17 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Cryptography and Security (cs.CR); Machine Learning (stat.ML)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
The paper investigates the theoretical benefits of retraining a model on its own predicted hard labels, specifically when the training data is linearly separable but the given labels are noisy. The authors prove that retraining can improve population accuracy over a model initially trained on the noisy labels. This phenomenon has implications for training with local label differential privacy (DP), which inherently involves noisy labels. The paper demonstrates how consensus-based retraining, i.e., retraining only on the samples where the model's predicted label agrees with the given noisy label, selectively improves label DP training without compromising privacy, achieving a 6.4% accuracy boost for ResNet-18 on CIFAR-100 with epsilon = 3 label DP (a minimal code sketch of consensus-based retraining follows these summaries).

Low Difficulty Summary (original content by GrooveSquid.com)
The paper explores why a model retrained on its own predicted labels can perform better when the original training labels are noisy. The authors prove a theoretical result showing that this retraining improves population accuracy. This has important implications for training models with local label differential privacy, which inherently involves noisy labels. The study demonstrates how to use consensus-based retraining to get better results without sacrificing privacy.

Keywords

* Artificial intelligence
* ResNet