
Summary of Knowledge Editing in Language Models via Adapted Direct Preference Optimization, by Amit Rozner et al.


Knowledge Editing in Language Models via Adapted Direct Preference Optimization

by Amit Rozner, Barak Battash, Lior Wolf, Ofir Lindenbaum

First submitted to arXiv on: 14 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, available on the arXiv page.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper proposes Knowledge Direct Preference Optimization (KDPO), a knowledge editing (KE) method that keeps Large Language Models (LLMs) up to date through targeted weight updates rather than expensive retraining. KE is treated as an LLM alignment problem and handled with an online approach that continually updates the knowledge stored in the model. For each edit, the model's current (outdated) knowledge serves as the negative sample and the new knowledge to be introduced serves as the positive sample in a Direct Preference Optimization (DPO) objective. Teacher-forcing is used to generate the negative samples, and optimizing with the positive samples helps keep the changes localized. Experimental results show that KDPO allows for more refined KE, achieving similar or better performance compared to previous methods.
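To make the preference objective concrete, below is a minimal sketch of a DPO-style loss for a batch of edits, assuming the standard DPO formulation with a frozen reference model; the function name, the beta value, and the toy log-probabilities are illustrative and not taken from the paper.

    # Minimal sketch of a DPO-style loss for knowledge editing (illustrative,
    # not the authors' implementation): the new fact is the "chosen" sequence
    # and the model's current answer is the "rejected" sequence.
    import torch
    import torch.nn.functional as F

    def dpo_edit_loss(policy_chosen_logp, policy_rejected_logp,
                      ref_chosen_logp, ref_rejected_logp, beta=0.1):
        """Standard DPO objective: increase the policy's preference for the
        new fact over the old one, relative to a frozen reference model."""
        chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
        rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
        return -F.logsigmoid(chosen_reward - rejected_reward).mean()

    # Toy sequence log-probabilities (one value per edit in the batch).
    policy_chosen = torch.tensor([-12.0, -9.5])    # log p_theta(new fact | prompt)
    policy_rejected = torch.tensor([-8.0, -7.0])   # log p_theta(old fact | prompt)
    ref_chosen = torch.tensor([-13.0, -10.0])      # same quantities under the
    ref_rejected = torch.tensor([-7.5, -6.8])      # frozen pre-edit model
    loss = dpo_edit_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected)
    print(loss)  # scalar loss to backpropagate through the policy model

In this formulation, beta controls how far the edited model may drift from the reference, and each log-probability is the sum over the tokens of the fact being edited.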
Low Difficulty Summary (written by GrooveSquid.com, original content)
Large Language Models (LLMs) can become outdated over time because their world knowledge is fixed at training time, which leads to factual errors and gaps. Knowledge Editing (KE) aims to overcome this challenge by updating the model without requiring expensive retraining, and the paper treats KE as an LLM alignment problem. Its proposed KDPO method uses the model's current knowledge as a negative sample and the new knowledge to be introduced as a positive sample in Direct Preference Optimization (DPO), which helps keep the changes localized. The results show that KDPO allows for more refined KE, achieving similar or better performance compared to previous methods.
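As a rough illustration of where the negative sample can come from, the sketch below queries a small pretrained model for its current answer to an edit prompt. The model name, prompt, new fact, and greedy decoding are placeholder choices, and this free-running generation step is only a simplified stand-in for the teacher-forced procedure described in the paper.

    # Hypothetical sketch of collecting the "rejected" (outdated) answer for
    # one edit; model, prompt, and decoding settings are illustrative only.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # placeholder; the paper edits larger LLMs
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "The capital of France is"  # edit prompt (illustrative)
    new_fact = " Lyon"                   # positive sample: counterfactual knowledge to insert

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=5, do_sample=False)
    old_fact = tokenizer.decode(output_ids[0, inputs["input_ids"].shape[1]:])

    print("rejected (current knowledge):", old_fact)  # the model's present answer
    print("chosen   (new knowledge):    ", new_fact)  # paired with it in the DPO loss above

The chosen/rejected pair produced this way is what the preference loss in the earlier sketch would consume for each edit.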

Keywords

» Artificial intelligence  » Alignment  » Optimization