Summary of Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions, by Jingtan Wang et al.
Helpful or Harmful Data? Fine-tuning-free Shapley Attribution for Explaining Language Model Predictions
by Jingtan Wang, Xiaoqiang Lin, Rui Qiao, Chuan-Sheng Foo, Bryan Kian Hsiang Low
First submitted to arXiv on: 7 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper addresses the need for explainability in fine-tuning, the most widely used method for adapting language models to downstream tasks. It focuses on instance attribution, which assigns each training example a score reflecting its contribution to a model's prediction. The authors argue that the robustness of these instance scores, in particular their stability under dataset resampling, has been overlooked. They formalize a notion of robustness for instance scores and show, both theoretically and empirically, that popular leave-one-out-based methods lack it. They then propose FreeShap, an efficient fine-tuning-free approximation of the Shapley value, which outperforms other methods on instance attribution and on data-centric applications such as data removal, data selection, and wrong-label detection. The approach is also generalized to large language models (LLMs), and the code is available online. (A generic Shapley-scoring sketch follows the table.) |
Low | GrooveSquid.com (original content) | This paper explains how we can better understand why AI models make certain predictions. Fine-tuning is a popular way to adapt models to new tasks, but it is hard to tell which training data makes a model work well. One idea is to look at individual training examples and score how much each one contributes to the model's predictions. However, existing ways of computing these "instance scores" turn out to be unreliable: the scores can change a lot when the training set is resampled. The authors introduce a new way to calculate instance scores, based on the Shapley value but without repeated fine-tuning, that is more reliable and efficient. |
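To make the attribution idea concrete, below is a minimal, hypothetical sketch of Shapley-based instance scoring via Monte Carlo permutation sampling. This is not the paper's FreeShap algorithm (FreeShap is specifically designed to avoid repeated fine-tuning); the `utility` function, its signature, and all names here are illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence

def shapley_instance_scores(
    train_data: Sequence,                 # training examples z_1..z_n
    utility: Callable[[List], float],     # e.g., validation score of a model trained on a subset
    num_permutations: int = 100,
    seed: int = 0,
) -> List[float]:
    """Monte Carlo (permutation-sampling) estimate of each example's Shapley value."""
    rng = random.Random(seed)
    n = len(train_data)
    scores = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        coalition: List = []
        prev_utility = utility(coalition)  # utility of the empty subset
        for idx in order:
            coalition.append(train_data[idx])
            curr_utility = utility(coalition)
            # Marginal contribution of example idx within this permutation
            scores[idx] += curr_utility - prev_utility
            prev_utility = curr_utility
    # Average marginal contributions across sampled permutations
    return [s / num_permutations for s in scores]
```

In this sketch, every call to `utility` would retrain (fine-tune) and evaluate a model on the given subset, which is exactly the cost that a fine-tuning-free approximation like FreeShap aims to avoid.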
Keywords
» Artificial intelligence » Fine tuning