Summary of LossVal: Efficient Data Valuation for Neural Networks, by Tim Wibiral et al.
LossVal: Efficient Data Valuation for Neural Networks
by Tim Wibiral, Mohamed Karim Belaid, Maximilian Rabus, Ansgar Scherp
First submitted to arXiv on: 5 Dec 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper introduces LossVal, an efficient data valuation method that computes importance scores during neural network training. The approach embeds a self-weighting mechanism into standard loss functions such as cross-entropy and mean squared error, avoiding the high computational cost of retraining-based approaches, which also ignore dependencies between data points. The authors demonstrate LossVal's effectiveness in identifying noisy samples and distinguishing helpful from harmful ones across multiple datasets. The proposed method is suitable for large-scale applications and can be used to assess the importance of individual training samples; a simplified code sketch follows the table. |
Low | GrooveSquid.com (original content) | LossVal is a new way to figure out which data points matter most when training machine learning models. Right now, people usually retrain their models with and without certain data points, but this is slow and doesn't take into account how different pieces of data relate to each other. LossVal is faster and more efficient: it adds a special mechanism to the loss function that calculates importance scores during training. This helps identify noisy or unhelpful data points, making it a valuable tool for big datasets and real-world applications. |
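
To make the self-weighting idea from the medium summary concrete, here is a minimal sketch in PyTorch of how per-sample importance weights might be embedded into a cross-entropy loss. All names here (`SelfWeightedCrossEntropy`, `sample_indices`) are illustrative assumptions rather than the authors' implementation, and the paper's exact loss formulation may include additional terms.

```python
# Minimal sketch of a self-weighted cross-entropy loss, assuming PyTorch.
# This illustrates the general idea only; it is not the authors' exact method.
import torch
import torch.nn.functional as F


class SelfWeightedCrossEntropy(torch.nn.Module):
    """Cross-entropy with one learnable importance weight per training sample."""

    def __init__(self, n_train_samples: int):
        super().__init__()
        # Importance weights are optimized jointly with the model parameters.
        self.sample_weights = torch.nn.Parameter(torch.ones(n_train_samples))

    def forward(self, logits, targets, sample_indices):
        # Unreduced loss yields one value per sample, so each can be weighted.
        per_sample_loss = F.cross_entropy(logits, targets, reduction="none")
        # Softmax keeps the weights positive and on a comparable scale
        # (this normalization choice is an assumption, not from the paper).
        w = torch.softmax(self.sample_weights, dim=0)[sample_indices]
        return (w * per_sample_loss).sum()
```

After training, the learned `sample_weights` can be read off as importance scores: samples that end up with low weight are candidates for being noisy or harmful, matching the use case the summaries describe.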
Keywords
» Artificial intelligence » Cross entropy » Loss function » Machine learning » Neural network