Summary of Visual Hallucination: Definition, Quantification, and Prescriptive Remediations, by Anku Rani et al.


Visual Hallucination: Definition, Quantification, and Prescriptive Remediations

by Anku Rani, Vipula Rawte, Harshad Sharma, Neeraj Anand, Krishnav Rajbangshi, Amit Sheth, Amitava Das

First submitted to arXiv on: 26 Mar 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper’s original abstract; read it on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
The paper investigates hallucination in vision-language models (VLMs) by profiling it across two tasks: image captioning and visual question answering (VQA). It identifies eight fine-grained orientations of visual hallucination, including contextual guessing, identity incongruity, and geographical erratum. To study this phenomenon, the authors create a publicly available dataset called VHILT, comprising 2,000 samples generated using eight VLMs across both tasks, along with human annotations for each category (an illustrative sketch of tallying such annotations appears after these summaries).

Low Difficulty Summary (original content by GrooveSquid.com)
Hallucination in AI is a big problem that makes it hard to trust machines. Researchers have worked on this issue for language models, but models that describe images have received much less attention. This paper looks at how vision-language models hallucinate and what kinds of mistakes they make. It finds eight different types of mistakes, like guessing from context or describing things that aren’t there. To help others study the problem, the authors create a dataset with 2,000 examples of these mistakes, each labeled by human annotators.

Keywords

» Artificial intelligence  » Hallucination  » Image captioning  » Question answering