Summary of PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset, by Jiazhen Liu et al.


PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset

by Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

First submitted to arxiv on: 17 Mar 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes PhD, the ChatGPT-Prompted visual hallucination evaluation Dataset, a large-scale benchmark for measuring how susceptible Multimodal Large Language Models (MLLMs) are to visual hallucination. The dataset is structured along two dimensions: task and mode. It covers five visual recognition tasks, ranging from low-level object and attribute recognition to mid-level sentiment recognition, position recognition, and counting. It also covers four question modes: normal visual QA, QA with inaccurate context, QA with incorrect context, and QA on AI-generated counter-common-sense (CCS) images. The dataset is constructed with a ChatGPT-assisted semi-automated pipeline of four modules: task-specific hallucinatory item (hitem) selection, hitem-embedded question generation, inaccurate/incorrect context generation, and counter-common-sense image generation. With over 14k daily images, 750 CCS images, and 102k VQA triplets in total, PhD reveals significant variability in MLLMs' performance across modes and tasks, offering valuable insights into the nature of hallucination.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This research paper creates a big database called PhD to help us understand how certain AI models can make mistakes when looking at pictures. These AI models are good at understanding text, but they are not as good at understanding images. The PhD database has many different types of questions and answers about the images, including some that are tricky or misleading. By studying this dataset, we can learn more about how these AI models work and what makes them make mistakes. This matters because it could help us improve these models so they are better at understanding images.
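To make the evaluation setup above concrete, here is a minimal sketch of how yes/no VQA triplets, grouped by task and question mode, might be scored against a model. The field names, example questions, and the dummy model are illustrative assumptions, not the dataset's actual schema or the paper's evaluation code.

```python
from collections import defaultdict

# Each triplet: an image reference, a question (optionally preceded by a
# misleading context sentence), and a ground-truth yes/no answer, tagged
# with its task and question mode. These records are made up for illustration.
triplets = [
    {"image": "img_001.jpg", "question": "Is there a dog in the image?",
     "answer": "yes", "task": "object", "mode": "base"},
    {"image": "img_001.jpg",
     "question": "The photo shows a cat park. Is there a dog in the image?",
     "answer": "yes", "task": "object", "mode": "incorrect_context"},
    {"image": "ccs_042.jpg", "question": "Is the fish flying in the sky?",
     "answer": "yes", "task": "object", "mode": "ccs"},
]

def dummy_mllm(image, question):
    """Stand-in for a real MLLM call; always answers 'yes'."""
    return "yes"

# Tally accuracy per (task, mode) cell, mirroring the per-mode
# performance comparison the summary describes.
hits = defaultdict(int)
totals = defaultdict(int)
for t in triplets:
    key = (t["task"], t["mode"])
    totals[key] += 1
    if dummy_mllm(t["image"], t["question"]) == t["answer"]:
        hits[key] += 1

for key in sorted(totals):
    print(key, hits[key] / totals[key])
```

Because hallucination probes like the incorrect-context and CCS modes deliberately pull the model away from the image, comparing the same model's accuracy across these (task, mode) cells is what exposes the variability the summary mentions.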

Keywords

» Artificial intelligence  » Hallucination  » Image generation