Summary of Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions, by Sanjay Kariyappa et al.
Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions
by Sanjay Kariyappa, Freddy Lécué, Saumitra Mishra, Christopher Pond, Daniele Magazzeni, Manuela Veloso
First submitted to arXiv on: 3 Jun 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The proposed Progressive Inference framework computes input attributions for decoder-only sequence classification models by leveraging the intermediate predictions made by their classification heads. Because causal attention restricts each position to attending over earlier tokens, the prediction at position i is effectively a prediction on the masked input subsequence ending at i, so these intermediate predictions come at minimal computational overhead. The authors develop two methods: Single Pass-Progressive Inference (SP-PI), which derives attributions from the differences between consecutive intermediate predictions in a single forward pass, and Multi Pass-Progressive Inference (MP-PI), which aggregates intermediate predictions from multiple masked versions of the input. Experiments on a variety of text classification tasks show that SP-PI and MP-PI provide more accurate attributions than prior work. (A minimal code sketch of the SP-PI idea follows this table.) |
Low | GrooveSquid.com (original content) | The researchers created a way to explain how decoder-only sequence classification models make their predictions. Instead of looking only at the final answer, they track what the model predicts as it reads the input one token at a time. Thanks to something called “causal attention”, each of these in-progress predictions depends only on the words seen so far, so they can all be collected without re-running the model. The researchers built two methods around this idea and found that they explained model predictions better than previous techniques. |
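To make the mechanism concrete, here is a minimal, hypothetical PyTorch sketch of the SP-PI idea. It is not the authors' implementation; the toy model, sizes, and the simple first-token baseline are assumptions for illustration. It shows how a causally masked model yields one intermediate prediction per input prefix in a single forward pass, and how per-token attributions fall out as differences between consecutive predictions.

```python
# A minimal sketch of the SP-PI idea, assuming a toy decoder-style classifier.
# This is NOT the authors' implementation: the model, sizes, and the simple
# first-token baseline below are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
vocab_size, d_model, num_classes, seq_len = 100, 32, 2, 6

embed = nn.Embedding(vocab_size, d_model)
# One self-attention layer with a causal mask stands in for a decoder-only
# model: position i can only attend to positions <= i, so the hidden state
# at i summarizes the prefix tokens[:i+1].
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
layer.eval()  # disable dropout for a deterministic toy run
head = nn.Linear(d_model, num_classes)  # classification head

tokens = torch.randint(0, vocab_size, (1, seq_len))
causal_mask = nn.Transformer.generate_square_subsequent_mask(seq_len)

with torch.no_grad():
    hidden = layer(embed(tokens), src_mask=causal_mask)
    # Applying the head at every position yields one "intermediate
    # prediction" per prefix, all from a single forward pass.
    probs = head(hidden).softmax(dim=-1)  # shape (1, seq_len, num_classes)

# Attribution of token i = change in the predicted class's probability when
# token i is appended to the prefix; token 0 gets its raw prefix probability
# as a crude baseline (the paper's exact baseline may differ).
pred_class = probs[0, -1].argmax()
p = probs[0, :, pred_class]
attributions = torch.cat([p[:1], p[1:] - p[:-1]])
print(attributions)  # one score per input token
```

MP-PI extends the same recipe across several masked variants of the input, aggregating their intermediate predictions and trading extra forward passes for more faithful attributions.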
Keywords
» Artificial intelligence » Attention » Classification » Decoder » Inference » Text classification