Learning to Summarize from LLM-generated Feedback
by Hwanjun Song, Taewon Yun, Yuho Lee, Jihwan Oh, Gihun Lee, Jason Cai, Hang Su
First submitted to arXiv on 17 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Developing effective text summarizers remains a challenge due to issues like hallucinations, omission of key information, and verbosity in LLM-generated summaries. To address this, the researchers introduce FeedSum, a large-scale dataset containing multi-dimensional LLM feedback on summaries of varying quality across diverse domains. The study examines how feedback quality, dimensionality, and granularity influence preference learning, finding that high-quality, multi-dimensional, fine-grained feedback significantly improves summary generation. Two methods for using this feedback are compared: supervised fine-tuning and direct preference optimization. The study also introduces SummLlama3-8b, a model that outperforms the much larger Llama3-70b-instruct in generating human-preferred summaries, demonstrating that smaller models can achieve superior performance with appropriate training. |
| Low | GrooveSquid.com (original content) | Researchers are trying to make computers better at summarizing texts. Right now, computer-generated summaries often include made-up or extra information and miss important details. To solve this, the scientists built a large dataset of feedback on summaries of different quality levels across many topics. They found that high-quality, detailed feedback helps models write better summaries. The study compared two ways to use this feedback: one retrains the model on the best summaries, and the other teaches it directly which of two summaries is better. Finally, they created a new model called SummLlama3-8b that generates better summaries than much larger models. |
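The second feedback-learning method mentioned above, direct preference optimization (DPO), trains a model to prefer the better of two summaries without a separate reward model. The paper does not spell out its training code, so the following is only a minimal sketch of the standard DPO loss for one chosen/rejected summary pair; the function name and the toy log-probability values are illustrative assumptions, not taken from the paper.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Inputs are summed log-probabilities of the chosen/rejected summary
    under the trainable policy (pi_*) and a frozen reference model (ref_*).
    beta scales how strongly the policy may drift from the reference.
    """
    # Implicit reward margin: how much more the policy (relative to the
    # reference) favors the chosen summary over the rejected one.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)), written in a numerically stable form.
    return math.log1p(math.exp(-margin))

# Toy values: the loss shrinks as the policy learns to prefer the
# human-preferred (chosen) summary more strongly than the reference does.
before = dpo_loss(-10.0, -10.0, -10.0, -10.0)  # no preference yet
after = dpo_loss(-8.0, -12.0, -10.0, -10.0)    # chosen up, rejected down
```

With zero margin the loss is log 2 (about 0.693); as the policy assigns more probability mass to the chosen summary, the loss falls toward zero, which is the gradient signal that replaces explicit reward modeling.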
Keywords
» Artificial intelligence » Fine-tuning » Optimization » Supervised