
Summary of ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models, by Zhixue Zhao et al.


ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models

by Zhixue Zhao, Boxuan Shan

First submitted to arXiv on: 1 Feb 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
This version is the paper's original abstract. Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper presents Recursive Attribution Generator (ReAGent), a new feature attribution method for generative language models. The approach addresses the limitations of existing methods, which were largely designed for encoder-only models and classification tasks. ReAGent recursively updates a token importance distribution by computing the difference between the model's predictions on the original input and on a modified input in which part of the context is replaced with RoBERTa predictions. The method can be applied to any generative LM without accessing internal model weights and without additional training or fine-tuning. The paper compares the faithfulness of ReAGent against seven popular feature attribution methods (FAs) across six decoder-only LMs of various sizes, showing that ReAGent consistently provides more faithful token importance distributions.
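To make the mechanism concrete, here is a minimal sketch of a ReAGent-style attribution loop in Python. This is not the authors' implementation: for brevity it replaces sampled positions with random vocabulary tokens as a stand-in for the RoBERTa in-filled predictions used in the paper, and the function name reagent_sketch plus the hyperparameters n_steps and replace_ratio are illustrative assumptions. Note that it only needs forward passes through the model, consistent with the paper's claim that no internal weights are accessed.

```python
# Minimal sketch of a ReAGent-style recursive attribution loop.
# Not the authors' implementation: sampled positions are replaced with
# random vocabulary tokens as a stand-in for the RoBERTa in-filled
# predictions used in the paper. Requires: pip install torch transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def reagent_sketch(model, tokenizer, text, n_steps=100, replace_ratio=0.3):
    input_ids = tokenizer(text, return_tensors="pt").input_ids  # shape (1, T)
    n_tokens = input_ids.shape[1]

    # Probability of the model's own next-token prediction on the full input.
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]
    target = int(logits.argmax())
    base_prob = torch.softmax(logits, dim=-1)[target].item()

    importance = torch.zeros(n_tokens)
    counts = torch.zeros(n_tokens)
    for _ in range(n_steps):
        # Sample a random subset of context positions to replace.
        mask = torch.rand(n_tokens) < replace_ratio
        if not mask.any():
            continue
        perturbed = input_ids.clone()
        perturbed[0, mask] = torch.randint(
            tokenizer.vocab_size, (int(mask.sum()),))
        with torch.no_grad():
            probs = torch.softmax(model(perturbed).logits[0, -1], dim=-1)
        # Tokens whose replacement lowers the target probability are
        # credited as important; the importance distribution is updated
        # recursively across sampling steps.
        importance[mask] += base_prob - probs[target].item()
        counts[mask] += 1
    return importance / counts.clamp(min=1)

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()
text = "The capital of France is"
scores = reagent_sketch(lm, tok, text)
for token, score in zip(tok.tokenize(text), scores):
    print(f"{token:>12}  {score.item():+.4f}")
```

Run on a small model such as GPT-2, this prints a per-token score whose magnitude reflects how much replacing that token tends to lower the probability of the predicted next token.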
Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper is about a new way to figure out which parts of a text matter most for a language model's predictions. There are already many methods for this, but most were developed for language models that only process text rather than generate it, so they may not work well for text-generating models. To solve this problem, the authors created ReAGent, a method that can be used with any language model that generates text. It works by measuring how much the model's predictions change when different parts of the input text are replaced. The paper shows that ReAGent identifies the important parts of the text more faithfully than other methods.

Keywords

* Artificial intelligence  * Classification  * Decoder  * Encoder  * Fine tuning  * Language model  * Token