Summary of Sub-goal Distillation: A Method to Improve Small Language Agents, by Maryam Hashemzadeh et al.
Sub-goal Distillation: A Method to Improve Small Language Agents
by Maryam Hashemzadeh, Elias Stengel-Eskin, Sarath Chandar, Marc-Alexandre Cote
First submitted to arXiv on: 4 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same paper but is written at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | The proposed method transfers the performance of a large language model (LLM) with billions of parameters to a much smaller language model (770M parameters), enabling efficient decision-making in long-horizon interactive tasks. The hierarchical agent consists of a planning module that learns, through knowledge distillation from an LLM, to generate sub-goals, and an execution module that learns to accomplish these sub-goals using elementary actions. By leveraging an LLM for oracle path annotation and fine-tuning both modules, the approach reduces the overall cost associated with LLM interactions to a fixed cost. In ScienceWorld, a challenging interactive text environment, the method surpasses standard imitation learning by 16.7% (absolute). The analysis also highlights the efficiency of the approach compared to other LLM-based methods. A sketch of the planner/executor loop appears after this table. |
| Low | GrooveSquid.com (original content) | A new way to make computers understand and respond to language has been developed. This system is more efficient than others because it uses a smaller “brain” that can still learn from a much larger one. The brain is divided into two parts: one that makes plans and another that takes actions. By training these parts separately, the system reduces the need for interactions with the large brain, making it faster and cheaper to use. This new method performs better than other methods in a challenging language-based game called ScienceWorld. |
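To make the hierarchical setup described in the medium-difficulty summary concrete, here is a minimal sketch of the planner/executor loop, not the paper's actual code. It assumes two small seq2seq checkpoints (flan-t5-large is used purely as an illustrative stand-in for the fine-tuned ~770M planner and executor) and a simplified `env` wrapper with hypothetical `reset`/`step` methods standing in for the real ScienceWorld API.

```python
# Illustrative sketch of a hierarchical planner/executor agent.
# Checkpoint names, prompt formats, and the `env` interface are assumptions,
# not the paper's implementation.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM


def generate(model, tokenizer, prompt, max_new_tokens=64):
    """Greedy decoding helper shared by the planner and the executor."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Two small seq2seq models (~770M parameters each in the paper); the checkpoint
# below is a placeholder for the fine-tuned planner and executor weights.
planner_tok = AutoTokenizer.from_pretrained("google/flan-t5-large")
planner = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")
executor_tok = AutoTokenizer.from_pretrained("google/flan-t5-large")
executor = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")


def run_episode(env, task_description, max_subgoals=10, max_steps_per_subgoal=20):
    """Planner proposes sub-goals; executor grounds each one in elementary actions."""
    # `env` is a hypothetical wrapper: reset() returns an observation,
    # step(action) returns (observation, done).
    observation = env.reset(task_description)
    completed = []
    for _ in range(max_subgoals):
        # Planning module: propose the next sub-goal from the task and history.
        plan_prompt = (
            f"Task: {task_description}\n"
            f"Completed sub-goals: {completed}\n"
            "Next sub-goal:"
        )
        subgoal = generate(planner, planner_tok, plan_prompt)
        if subgoal.strip().lower() == "done":
            break
        # Execution module: emit elementary actions until the sub-goal is met.
        for _ in range(max_steps_per_subgoal):
            act_prompt = f"Sub-goal: {subgoal}\nObservation: {observation}\nAction:"
            action = generate(executor, executor_tok, act_prompt)
            observation, done = env.step(action)
            if done:
                break
        completed.append(subgoal)
    return completed
```

Note that the large LLM never appears in this loop: per the summary, it is only used offline to annotate oracle paths with sub-goals for fine-tuning, so inference runs entirely on the two small models and the LLM cost stays fixed.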
Keywords
» Artificial intelligence » Fine tuning » Knowledge distillation » Language model » Large language model