

Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions

by Sanjay Kariyappa, Freddy Lécué, Saumitra Mishra, Christopher Pond, Daniele Magazzeni, Manuela Veloso

First submitted to arXiv on: 3 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty summary is the paper’s original abstract; read the original abstract here.
Medium Difficulty Summary (GrooveSquid.com original content)
The proposed Progressive Inference framework computes input attributions for decoder-only sequence classification models by leveraging the intermediate predictions produced by their classification heads. Because the attention mechanism is causal, the prediction at each position depends only on the tokens up to that position, so the model yields predictions on masked input subsequences with minimal computational overhead. The authors develop two methods: Single Pass-Progressive Inference (SP-PI), which derives attributions from consecutive intermediate predictions in a single forward pass, and Multi Pass-Progressive Inference (MP-PI), which uses intermediate predictions from multiple masked versions of the input (see the illustrative sketch after these summaries). Experiments on a variety of text classification tasks show that SP-PI and MP-PI provide more accurate attributions than prior work.
Low Difficulty Summary (GrooveSquid.com original content)
The researchers created a way to explain how decoder-only sequence classification models make their predictions. They did this by looking at what these models predict as they process the input, rather than only at the final prediction. The approach relies on something called “causal attention,” which means each prediction only uses the earlier parts of the input, so predictions on partial inputs come at little extra cost. The researchers developed two methods based on this idea and found that they explain model predictions better than previous techniques.

Keywords

» Artificial intelligence  » Attention  » Classification  » Decoder  » Inference  » Text classification