Summary of Visual Hallucination: Definition, Quantification, and Prescriptive Remediations, by Anku Rani et al.
Visual Hallucination: Definition, Quantification, and Prescriptive Remediations
by Anku Rani, Vipula Rawte, Harshad Sharma, Neeraj Anand, Krishnav Rajbangshi, Amit Sheth, Amitava Das
First submitted to arXiv on: 26 Mar 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper investigates hallucination in Vision-Language Models (VLMs) by profiling it across two tasks: image captioning and Visual Question Answering (VQA). It identifies eight fine-grained orientations of visual hallucination, including contextual guessing, identity incongruity, geographical erratum, and more. To study this phenomenon, the authors release a publicly available dataset called VHILT, comprising 2,000 samples generated by eight VLMs across both tasks, each with human annotations for the relevant category. (An illustrative code sketch for exploring such a dataset follows the table.) |
| Low | GrooveSquid.com (original content) | Hallucination in AI is a big problem that makes it hard to trust machines. Researchers have been trying to solve this issue in language models, but vision-language models have received far less attention. This paper looks at how these models hallucinate and what kinds of mistakes they make. It finds eight different types of mistakes, like guessing wrong or describing things that aren't there. To help others study this problem, the authors create a big dataset with many examples of these mistakes. |
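As a rough illustration of how a reader might explore a dataset organized the way VHILT is described here (2,000 samples, eight VLMs, two tasks, one annotated hallucination category per sample), the minimal Python sketch below tallies categories and models from an annotations file. The file name and column names are assumptions for illustration only; the summary does not specify the actual release format.

```python
# Illustrative only: the real VHILT file layout is not described in this summary,
# so the file name and column names below are hypothetical.
import csv
from collections import Counter

def tally_hallucinations(path: str) -> tuple[Counter, Counter]:
    """Count annotated hallucination categories overall and per generating VLM."""
    by_category: Counter = Counter()
    by_model: Counter = Counter()
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Hypothetical columns: 'task' (captioning / VQA), 'model' (the VLM),
            # and 'category' (one of the eight annotated orientations).
            by_category[row["category"]] += 1
            by_model[row["model"]] += 1
    return by_category, by_model

if __name__ == "__main__":
    categories, models = tally_hallucinations("vhilt_annotations.csv")  # hypothetical file name
    for name, count in categories.most_common():
        print(f"{name}: {count}")
```

Such a tally would show how the eight hallucination orientations are distributed across the contributing models and tasks, which is the kind of profiling the paper reports.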
Keywords
» Artificial intelligence » Hallucination » Image captioning » Question answering