Summary of Non-instructional Fine-tuning: Enabling Instruction-following Capabilities in Pre-trained Language Models Without Instruction-following Data, by Juncheng Xie et al.
Non-instructional Fine-tuning: Enabling Instruction-Following Capabilities in Pre-trained Language Models without Instruction-Following Data
by Juncheng Xie, Shensian Syu, Hung-yi Lee
First submitted to arXiv on: 27 Aug 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The authors propose a novel approach for fine-tuning large language models (LLMs) to follow instructions without relying on traditional supervised instruction data. They take the first half of a random text from OpenWebText as the "instruction" and use GPT-3.5-turbo or GPT-4-turbo to complete the text as the "response." Despite this "non-instructional" data, pre-trained LLMs gain instruction-following capabilities after fine-tuning on it, as demonstrated on several well-known models (e.g., LLaMA-2-7B, LLaMA-3-8B, LLaMA-3-70B, Mistral-7B-v0.1). The non-instructional data also improves some models that have already undergone supervised fine-tuning and human preference alignment; notably, LLaMA-3-70B-Instruct fine-tuned on this data is comparable to LLaMA-3.1-70B-Instruct on the Arena Hard leaderboard. |
| Low | GrooveSquid.com (original content) | This research shows that large language models can learn to follow instructions without any special instruction data. The authors take the first half of random texts from OpenWebText as "instructions" and have models like GPT-3.5-turbo or GPT-4-turbo complete them, treating each completion as a "response." Surprisingly, models fine-tuned on these pairs learn to follow instructions even though the data was never written as instructions. The authors try this approach on several different models and find that it makes them better at following instructions. |
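The data-construction recipe in the summaries above can be sketched in a few lines: split each document at its midpoint, keep the first half as the "instruction," and pair it with a model-generated continuation as the "response." This is a minimal illustration, not the authors' code; `complete_text` is a hypothetical stand-in for a call to GPT-3.5-turbo or GPT-4-turbo, and the sample documents stand in for real OpenWebText texts.

```python
def split_in_half(text: str) -> tuple[str, str]:
    """Split a document at its midpoint, counting whitespace-separated words."""
    words = text.split()
    mid = len(words) // 2
    return " ".join(words[:mid]), " ".join(words[mid:])

def complete_text(prefix: str) -> str:
    """Hypothetical stand-in for an LLM continuation call
    (e.g. GPT-3.5-turbo or GPT-4-turbo in the paper)."""
    return f"<model continuation of: {prefix[:30]}>"

def build_pairs(documents: list[str]) -> list[dict[str, str]]:
    """Turn raw documents into (instruction, response) fine-tuning pairs."""
    pairs = []
    for doc in documents:
        first_half, _ = split_in_half(doc)
        pairs.append({
            "instruction": first_half,          # first half of random text
            "response": complete_text(first_half),  # model completion
        })
    return pairs
```

The resulting pairs have the same shape as a standard instruction-tuning dataset, which is what lets an ordinary supervised fine-tuning pipeline consume them unchanged.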
Keywords
» Artificial intelligence » Alignment » Gpt » Llama » Supervised