

Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models

by Swarnava Sinha Roy, Ayan Kundu

First submitted to arXiv on: 5 Dec 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.
Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes a novel method called Uniform Discretized Integrated Gradients (UDIG) for explaining Large Language Models (LLMs). The technique addresses the limitations of traditional Integrated Gradients when dealing with discrete feature spaces such as word embeddings. UDIG employs a new interpolation strategy that allows for non-linear paths in the embedding space, making it better suited to predictive language models. The method is evaluated on two NLP tasks, Sentiment Classification and Question Answering, using three metrics: Log Odds, Comprehensiveness, and Sufficiency. The results show that UDIG outperforms existing methods on almost all metrics, with benchmarks conducted on the SST2, IMDb, Rotten Tomatoes, and SQuAD datasets.
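To make the underlying idea concrete, here is a minimal sketch of classic Integrated Gradients, the baseline that UDIG builds on. It interpolates along a straight line from a baseline input to the actual input and averages gradients along that path; UDIG's contribution, per the summary above, is to replace this straight-line path with a non-linear path suited to discrete embedding spaces (not shown here). The function names and the toy model are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate classic Integrated Gradients attributions.

    grad_fn: returns the gradient of the model output w.r.t. its input.
    Uses a straight-line path from baseline to x; UDIG instead follows
    a non-linear path through the discrete embedding space.
    """
    # Interpolation points along the straight line baseline -> x.
    alphas = np.linspace(0.0, 1.0, steps + 1)
    grads = np.array([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    # Trapezoidal average of the gradients along the path.
    avg_grad = (grads[:-1] + grads[1:]).mean(axis=0) / 2.0
    # Scale by the input difference (the IG completeness axiom).
    return (x - baseline) * avg_grad

# Toy differentiable "model": f(x) = x0^2 + 3*x1, so grad = [2*x0, 3].
grad_f = lambda x: np.array([2.0 * x[0], 3.0])
attr = integrated_gradients(grad_f, np.array([1.0, 2.0]), np.zeros(2))
# Attributions sum to f(x) - f(baseline) = 7.0, as IG guarantees.
```

For this toy model the attributions are exact ([1.0, 6.0]) because the gradient is linear along the path; for a real LLM, `grad_fn` would backpropagate through the model to the input embeddings.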
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about a new way to understand how language models make decisions. Right now, it is hard to explain why these models make certain predictions, especially when dealing with words or phrases that are close together in meaning. The authors propose a solution called Uniform Discretized Integrated Gradients (UDIG) that handles this issue better than current methods. They test their approach on two types of tasks: identifying the sentiment of text and answering questions based on text. Their method outperforms others in many cases, which is important for making language models more useful and reliable.

Keywords

» Artificial intelligence  » Classification  » Embedding space  » NLP  » Question answering