
Summary of Uncertainty Quantification of Data Shapley via Statistical Inference, by Mengmeng Wu et al.


Uncertainty Quantification of Data Shapley via Statistical Inference

by Mengmeng Wu, Zhihong Liu, Xiang Li, Ruoxi Jia, Xiangyu Chang

First submitted to arXiv on: 28 Jul 2024

Categories

  • Main: Machine Learning (stat.ML)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes an approach that addresses a limitation of Data Shapley, a widely used method for data valuation in machine learning. The authors establish a connection between Data Shapley and infinite-order U-statistics, which lets them quantify how uncertain Data Shapley values are when the data distribution changes. The study presents two algorithms for estimating this uncertainty and gives recommendations on when to apply each. Experiments on various datasets verify asymptotic normality and demonstrate the method's practical applicability in a trading scenario.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper finds a way to check that we are valuing data correctly, even when the data is changing or growing. The researchers connect Data Shapley, a popular method for valuing data, to another idea called infinite-order U-statistics. This helps them understand how much a valuation might change if the data changes. They come up with two ways to estimate this uncertainty and show that their ideas work on different datasets. This could be useful in real-world applications, like trading.
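
For readers who want to see what a Data Shapley value is in practice, here is a minimal Python sketch. It is not the paper's method: the summaries above do not describe the two algorithms, so nothing here reproduces them. The sketch computes exact Data Shapley values for a tiny dataset from the standard definition (a weighted average of each point's marginal contribution over all subsets of the other points), then a permutation-sampling estimate with a Monte Carlo standard error. That standard error only measures sampling noise in the estimate, not the distribution-level uncertainty the paper studies. The toy utility, `TARGET`, and all function names are assumptions made for illustration.

```python
import itertools
import math
import random

TARGET = 2.0  # hypothetical target used by the toy utility below


def utility(subset):
    """Toy utility (an assumption): negative squared error of the subset mean vs TARGET."""
    mean = sum(subset) / len(subset) if subset else 0.0
    return -(mean - TARGET) ** 2


def exact_data_shapley(points, utility_fn):
    """Exact Data Shapley: average marginal contribution over all subsets of the others."""
    n = len(points)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset that point i joins
            weight = 1.0 / (n * math.comb(n - 1, k))
            for subset in itertools.combinations(others, k):
                s = [points[j] for j in subset]
                values[i] += weight * (utility_fn(s + [points[i]]) - utility_fn(s))
    return values


def permutation_data_shapley(points, utility_fn, num_permutations, seed=0):
    """Permutation-sampling estimate with a Monte Carlo standard error per point."""
    rng = random.Random(seed)
    n = len(points)
    sums = [0.0] * n
    sq_sums = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        prefix = []
        prev = utility_fn(prefix)
        for i in order:
            prefix.append(points[i])
            cur = utility_fn(prefix)
            marginal = cur - prev  # contribution of point i given the points before it
            sums[i] += marginal
            sq_sums[i] += marginal * marginal
            prev = cur
    means = [s / num_permutations for s in sums]
    std_errs = [
        math.sqrt(max(sq / num_permutations - mu * mu, 0.0) / num_permutations)
        for sq, mu in zip(sq_sums, means)
    ]
    return means, std_errs


points = [1.0, 2.0, 4.0]
exact = exact_data_shapley(points, utility)
mc_means, mc_errs = permutation_data_shapley(points, utility, num_permutations=2000)
```

In both versions the values sum to the utility of the full dataset minus the utility of the empty set, which is a quick sanity check on any implementation. The exact version enumerates every subset, so it is only feasible for very small datasets; sampling permutations is the usual workaround.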

Keywords

» Artificial intelligence  » Machine learning