Summary of Editing Arbitrary Propositions in LLMs Without Subject Labels, by Itai Feigenbaum et al.
Editing Arbitrary Propositions in LLMs without Subject Labels
by Itai Feigenbaum, Devansh Arpit, Huan Wang, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Caiming Xiong, Silvio Savarese
First submitted to arXiv on: 15 Jan 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper presents a novel approach to editing Large Language Models (LLMs) so that their responses to specific propositions can be modified while overall accuracy is preserved. The authors propose a Locate-and-Edit (L&E) method called Gradient Tracing (GT), which can edit arbitrary propositions without relying on semantic subject labels. GT traces the gradient of the model’s output to locate where the relevant information is stored, then edits it using a mild variant of Rank-One Model Editing (ROME). On datasets of binary propositions derived from the CounterFact dataset, the approach achieves results comparable to state-of-the-art L&E methods that rely on subject labels. The authors also introduce a new dataset, Factual Accuracy Classification Test (FACT), which includes non-binary propositions and lies beyond the scope of existing L&E methods. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper helps us edit Large Language Models to make them more accurate. Right now, these models can get facts wrong if we ask them tricky questions. The authors came up with a new way to fix this problem by finding where the model stores information about specific facts and then editing that information. This method is really fast and doesn’t need special labels for each fact. They tested it on some examples and got results that are almost as good as other methods that use those special labels. The authors also created a new test dataset with tricky questions that even the best current methods can’t handle, but their method can! |
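The locate-then-edit pipeline the summaries describe can be illustrated schematically. The sketch below is NOT the authors' implementation: the function names (`locate_by_gradient`, `rank_one_edit`) and the toy per-layer gradient norms are invented for illustration, and the edit shown is the generic rank-one update underlying ROME-style methods, under the assumption that "locating" amounts to picking the layer whose weights receive the largest gradient for the target proposition.

```python
import numpy as np

def locate_by_gradient(grad_norms):
    """Gradient-tracing step, schematically: pick the layer whose
    parameters receive the largest gradient norm for the target fact."""
    return int(np.argmax(grad_norms))

def rank_one_edit(W, k, v):
    """A ROME-style rank-one update: change W minimally (in the
    direction of key k) so that the edited matrix maps k to value v."""
    residual = v - W @ k
    return W + np.outer(residual, k) / (k @ k)

# Toy usage: three "layers"; edit only the located one.
rng = np.random.default_rng(0)
layers = [rng.standard_normal((4, 4)) for _ in range(3)]
grad_norms = [0.1, 2.3, 0.4]          # pretend per-layer gradient norms
k = rng.standard_normal(4)            # key vector (located representation)
v = rng.standard_normal(4)            # desired value (the new fact)

idx = locate_by_gradient(grad_norms)  # layer with the largest gradient
layers[idx] = rank_one_edit(layers[idx], k, v)
print(np.allclose(layers[idx] @ k, v))  # True: edited layer now maps k to v
```

The rank-one form is why such edits are cheap: only one weight matrix changes, by an outer product, so all other behavior of the model is left untouched except along the direction of `k`.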
Keywords
* Artificial intelligence
* Classification