A Survey on Hallucination in Large Vision-Language Models

by Hanchao Liu, Wenyuan Xue, Yifei Chen, Dapeng Chen, Xiutian Zhao, Ke Wang, Liping Hou, Rongjun Li, Wei Peng

First submitted to arXiv on: 1 Feb 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Computation and Language (cs.CL); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The recent advances in Large Vision-Language Models (LVLMs) have sparked significant interest in the AI community, but their practical deployment is hindered by “hallucination”: a misalignment between the factual visual content and the corresponding generated text. To provide an overview and facilitate future mitigation efforts, this comprehensive survey dissects LVLM-related hallucinations. The paper clarifies the concept of hallucination in LVLMs, highlighting its varied symptoms and the challenges it poses. It then outlines benchmarks and methodologies for evaluating these unique hallucinations (see the illustrative sketch after the summaries below) and investigates their root causes, drawing on insights from training data and model components. Finally, the survey critically reviews existing methods for mitigating hallucinations.
Low Difficulty Summary (written by GrooveSquid.com, original content)
Large Vision-Language Models have made big progress, but they can get confused about what’s real or not. This is called “hallucination.” The problem is that LVLMs are really good at making text based on what they see, but sometimes they make things up! This survey looks at why this happens and how to fix it.
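
A toy illustration of the evaluation idea mentioned in the medium summary: the sketch below is not code from the paper, but it mimics the basic logic behind object-level hallucination checks in the spirit of metrics such as CHAIR, flagging objects that a generated caption mentions but that are absent from the image's ground-truth annotations. The function name, toy vocabulary, and data format are assumptions made purely for illustration.

```python
# Illustrative sketch only (not from the surveyed paper or any specific benchmark).
# Idea: an object mentioned in generated text but missing from the image's
# ground-truth object annotations is counted as a hallucinated mention.

def hallucinated_objects(caption: str, ground_truth_objects: set[str]) -> set[str]:
    """Return object words the caption mentions that are not annotated in the image."""
    # Naive tokenization; real benchmarks use synonym lists and proper parsing.
    tokens = {word.strip(".,!?").lower() for word in caption.split()}
    # Toy object vocabulary, assumed for this example.
    vocabulary = {"dog", "cat", "frisbee", "car", "person", "tree"}
    mentioned_objects = tokens & vocabulary
    return mentioned_objects - ground_truth_objects


if __name__ == "__main__":
    caption = "A dog catches a frisbee while a cat watches from a car."
    annotated = {"dog", "frisbee", "tree"}
    print(hallucinated_objects(caption, annotated))  # e.g. {'cat', 'car'}
```

Real benchmarks reviewed in hallucination surveys go well beyond this toy check, for example by handling synonyms, using parsers or question-answering probes, and aggregating rates over large annotated datasets.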

Keywords

* Artificial intelligence
* Hallucination