Summary of Uncertainty Quantification of Data Shapley via Statistical Inference, by Mengmeng Wu et al.
Uncertainty Quantification of Data Shapley via Statistical Inference
by Mengmeng Wu, Zhihong Liu, Xiang Li, Ruoxi Jia, Xiangyu Chang
First submitted to arXiv on: 28 Jul 2024
Categories
- Main: Machine Learning (stat.ML)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper addresses a limitation of Data Shapley, a widely used method for data valuation in machine learning: it values data relative to a fixed dataset, whereas real-world data changes and grows. The authors establish a connection between Data Shapley and infinite-order U-statistics and use it to quantify the uncertainty of Data Shapley under changes in the data distribution. They present two algorithms for estimating this uncertainty and give recommendations on when to apply each. Experiments on various datasets verify asymptotic normality and demonstrate the method's practical applicability in a trading scenario. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper looks at how much we can trust the value we assign to data when that data keeps changing or growing. The researchers connect Data Shapley, a popular method for valuing data, to another idea called infinite-order U-statistics. This helps them understand how much a valuation might change if the data changes. They come up with two ways to estimate this uncertainty and show that their ideas work on different datasets. This could be useful in real-world applications, like trading. |
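
To make the idea in the summaries more concrete, here is a minimal Python sketch. It is not either of the paper's two algorithms. It computes the exact Data Shapley value of one point by enumerating subsets, then redraws the rest of the dataset many times to see how that value varies, and reports a normal-approximation interval for its average. The toy utility, the Gaussian data generator, and the names `utility`, `exact_shapley`, and `value_under_resampling` are all assumptions made for illustration.

```python
import itertools
import math
import random
import statistics


def utility(subset, target=1.0):
    """Toy utility (an assumption): negative squared error of the subset mean.

    An empty subset predicts 0.0.
    """
    prediction = sum(subset) / len(subset) if subset else 0.0
    return -(prediction - target) ** 2


def exact_shapley(points, index, utility_fn=utility):
    """Exact Data Shapley value of points[index], enumerating every subset.

    Cost grows as 2^(n-1), so this is only usable for very small datasets.
    """
    n = len(points)
    others = [p for j, p in enumerate(points) if j != index]
    value = 0.0
    for size in range(n):
        # Standard Shapley weight for subsets of this size.
        weight = 1.0 / (n * math.comb(n - 1, size))
        for subset in itertools.combinations(others, size):
            with_point = utility_fn(list(subset) + [points[index]])
            without_point = utility_fn(list(subset))
            value += weight * (with_point - without_point)
    return value


def value_under_resampling(point, n_others=5, n_draws=200, seed=0):
    """Redraw the other points from a hypothetical N(1, 1) source and
    recompute the Shapley value of `point` each time.

    Returns the average value and a 95% normal-approximation interval
    for that average.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_draws):
        others = [rng.gauss(1.0, 1.0) for _ in range(n_others)]
        values.append(exact_shapley([point] + others, 0))
    mean = statistics.fmean(values)
    half_width = 1.96 * statistics.stdev(values) / math.sqrt(n_draws)
    return mean, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    mean, (low, high) = value_under_resampling(point=1.0)
    print(f"average value: {mean:.4f}, 95% interval: [{low:.4f}, {high:.4f}]")
```

The spread of the recomputed values is the kind of uncertainty the summaries describe: the same data point receives a different valuation each time the surrounding data changes.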
Keywords
» Artificial intelligence » Machine learning