Summary of A Survey of Hallucination in Large Visual Language Models, by Wei Lan et al.


A Survey of Hallucination in Large Visual Language Models

by Wei Lan, Wenyi Chen, Qingfeng Chen, Shirui Pan, Huiyu Zhou, Yi Pan

First submitted to arXiv on: 20 Oct 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
Large Visual Language Models (LVLMs) integrate the visual modality with Large Language Models (LLMs), enhancing user interaction and enriching the user experience. LVLMs have demonstrated powerful information processing and generation capabilities, but hallucinations limit their potential and practical effectiveness. This survey reviews recent work on hallucination correction and mitigation, introducing the background of LVLMs and hallucinations, the main causes of hallucination, and the available evaluation benchmarks from both judgmental and generative perspectives (a toy sketch of the two evaluation styles follows these summaries).

Low Difficulty Summary (original content by GrooveSquid.com)
LVLMs combine visual and language processing to improve user interaction and experience. They can process and generate information well, but sometimes make things up (hallucinate), which makes them hard to trust in real-life applications. This survey looks at how researchers are trying to fix this problem by correcting or mitigating hallucinations. It also explains where these hallucinations come from and what benchmarks are used to test the models' reliability.

Keywords

» Artificial intelligence  » Hallucination