Summary of PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset, by Jiazhen Liu et al.


PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset

by Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

First submitted to arxiv on: 17 Mar 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes PhD, the ChatGPT-Prompted visual hallucination evaluation Dataset, a large-scale benchmark for measuring how susceptible Multimodal Large Language Models (MLLMs) are to visual hallucination. The dataset is structured along two dimensions: task and mode. It covers five visual recognition tasks, ranging from low-level object and attribute recognition to mid-level sentiment recognition, position recognition, and counting. It also covers four question modes: normal visual QA, QA with inaccurate context, QA with incorrect context, and QA on AI-generated counter-common-sense (CCS) images. The dataset is constructed with a ChatGPT-assisted semi-automated pipeline of four modules: task-specific hallucinatory item (hitem) selection, hitem-embedded question generation, inaccurate/incorrect context generation, and counter-common-sense image generation. With over 14k daily images, 750 CCS images, and 102k VQA triplets in total, PhD reveals significant variability in MLLMs' performance across modes and tasks, offering valuable insights into the nature of hallucination.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This research paper creates a big database called PhD to help us understand how certain AI models can make mistakes when looking at pictures. These AI models are good at understanding text, but they are not as good at understanding images. The PhD database has many different types of questions and answers about the images, including some that are tricky or misleading. By studying this dataset, we can learn more about how these AI models work and what makes them make mistakes. This matters because it could help us improve these models so they are better at understanding images.
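To make the evaluation setup above concrete, here is a minimal sketch of how yes/no VQA triplets, grouped by task and question mode, might be scored against a model. The field names, example questions, and the dummy model are illustrative assumptions, not the dataset's actual schema or the paper's evaluation code.

```python
from collections import defaultdict

# Each triplet: an image reference, a question (optionally preceded by a
# misleading context sentence), and a ground-truth yes/no answer, tagged
# with its task and question mode. These records are made up for illustration.
triplets = [
    {"image": "img_001.jpg", "question": "Is there a dog in the image?",
     "answer": "yes", "task": "object", "mode": "base"},
    {"image": "img_001.jpg",
     "question": "The photo shows a cat park. Is there a dog in the image?",
     "answer": "yes", "task": "object", "mode": "incorrect_context"},
    {"image": "ccs_042.jpg", "question": "Is the fish flying in the sky?",
     "answer": "yes", "task": "object", "mode": "ccs"},
]

def dummy_mllm(image, question):
    """Stand-in for a real MLLM call; always answers 'yes'."""
    return "yes"

# Tally accuracy per (task, mode) cell, mirroring the per-mode
# performance comparison the summary describes.
hits = defaultdict(int)
totals = defaultdict(int)
for t in triplets:
    key = (t["task"], t["mode"])
    totals[key] += 1
    if dummy_mllm(t["image"], t["question"]) == t["answer"]:
        hits[key] += 1

for key in sorted(totals):
    print(key, hits[key] / totals[key])
```

Because hallucination probes like the incorrect-context and CCS modes deliberately pull the model away from the image, comparing the same model's accuracy across these (task, mode) cells is what exposes the variability the summary mentions.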

Keywords

» Artificial intelligence  » Hallucination  » Image generation