Summary of Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training, by Shahrad Mohammadzadeh et al.
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training
by Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi
First submitted to arXiv on: 20 Oct 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL); Spectral Theory (math.SP)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | As large language models (LLMs) gain widespread adoption, concerns about their reliability have grown due to hallucinations: inaccurate or irrelevant outputs. Our research investigates the link between the training process and the emergence of hallucinations, addressing a gap in existing work, which focuses mainly on post-hoc detection and mitigation. We analyze hallucination trends throughout training using the Pythia suite of models (70M-12B parameters) and several hallucination metrics. We introduce Sensitivity Dropout (SenD), a training protocol that reduces variance during training by deterministically dropping Sensitive Embedding Indices, the embedding dimensions with the greatest variability. SenD relies on Efficient EigenScore (EES), an unsupervised hallucination detection metric that approximates the traditional EigenScore at 2x speed. Our empirical evaluation shows that SenD improves LLM reliability at test time by up to 40% compared to normal training, while also enhancing factual accuracy in the Wikipedia, Medical, and LegalBench domains. A rough code sketch of these two ideas appears after this table. |
| Low | GrooveSquid.com (original content) | Large language models are getting better at understanding and generating human-like text, but they're not perfect. Sometimes they produce information that's just plain wrong! This paper looks at why that happens and how to make it happen less often. The researchers tested their ideas on a family of language models called the Pythia suite. They found a way to reduce these errors by changing how the models are trained. The new method is called Sensitivity Dropout, or SenD for short. It works by switching off the small parts of the model's internal representation that change the most and tend to cause mistakes. The researchers also built a faster tool for detecting when a model is producing nonsense. They tested the method on three areas: Wikipedia, Medical, and LegalBench. The results show that SenD makes the models up to 40% more reliable than normal training. |
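The summaries above do not spell out the paper's exact SenD and EES procedures, so the following minimal PyTorch sketch only illustrates the two ideas as described: score how dispersed sampled hidden states are (an EigenScore-style metric), and deterministically zero the most variable embedding dimensions. All function names, parameters, and formulas here (efficient_eigenscore, sensitivity_dropout, variability, drop_fraction) are assumptions for illustration, not the authors' implementation.

```python
import torch


def efficient_eigenscore(hidden_states: torch.Tensor, k: int = 10, eps: float = 1e-6) -> float:
    """Hypothetical EigenScore-style metric (not the paper's exact EES).

    Measures how spread out the hidden representations of several sampled
    generations are, using only the top-k singular values of the centered
    matrix instead of a full covariance log-determinant.
    hidden_states: (num_samples, hidden_dim).
    """
    centered = hidden_states - hidden_states.mean(dim=0, keepdim=True)
    s = torch.linalg.svdvals(centered)[:k]              # dominant spectrum only
    eigvals = s.pow(2) / max(hidden_states.shape[0] - 1, 1)
    return torch.log(eigvals + eps).mean().item()        # higher => more dispersion


def sensitivity_dropout(embedding: torch.nn.Embedding,
                        variability: torch.Tensor,
                        drop_fraction: float = 0.01) -> torch.Tensor:
    """Hypothetical SenD-style step: deterministically zero the embedding
    dimensions whose variability is highest.

    variability: per-dimension score of shape (embedding_dim,), e.g. the
    variance of each dimension measured over recent training checkpoints.
    Returns the indices of the dropped dimensions.
    """
    num_drop = max(1, int(drop_fraction * variability.numel()))
    sensitive_idx = torch.topk(variability, num_drop).indices
    with torch.no_grad():
        embedding.weight[:, sensitive_idx] = 0.0         # deterministic, not random
    return sensitive_idx
```

As a purely illustrative usage pattern, one might call sensitivity_dropout(model.get_input_embeddings(), per_dim_variance) every few hundred training steps, with per_dim_variance computed from a rolling buffer of recent embedding checkpoints, and track efficient_eigenscore on held-out generations to monitor hallucination tendency during training.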
Keywords
» Artificial intelligence » Dropout » Embedding » Hallucination » Unsupervised