Summary of Uncovering Bias in Large Vision-Language Models with Counterfactuals, by Phillip Howard et al.
Uncovering Bias in Large Vision-Language Models with Counterfactuals
by Phillip Howard, Anahita Bhiwandiwalla, Kathleen C. Fraser, Svetlana Kiritchenko
First submitted to arXiv on: 29 Mar 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract, available on the arXiv listing. |
| Medium | GrooveSquid.com (original content) | Large Vision-Language Models (LVLMs) combine visual inputs with text prompts, enabling applications such as visual question answering. While social biases in Large Language Models (LLMs) have been studied, they remain underexplored in LVLMs because bias can arise across both modalities. This study addresses the challenge by examining text generated by LVLMs under counterfactual changes to input images, with the same open-ended prompt held fixed. The authors find that social attributes such as race and gender depicted in the images significantly influence toxicity and the generation of competency-related words. A sketch of this counterfactual probing setup appears below the table. |
| Low | GrooveSquid.com (original content) | Imagine a computer program that can understand and respond to both written words and pictures. This kind of program, called an LVLM, is important for tasks like answering questions about images. Some people are concerned that these programs may be biased against certain groups of people. This research looks at how different LVLMs behave when shown pictures that differ in a single characteristic, such as whether a doctor is male or female, and finds that a program's responses can be influenced by the characteristics it sees in the picture. |
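To make the evaluation procedure concrete, the sketch below shows one way such a counterfactual probe could be wired up: generate text from an LVLM for images that differ only in a depicted social attribute while keeping the prompt fixed, then compare a per-attribute score over the generations. This is a minimal sketch under stated assumptions: `generate_text`, `toxicity_score`, the image paths, and the prompts are hypothetical placeholders, not the paper's released code or any specific library's API.

```python
# Minimal sketch of a counterfactual bias probe for an LVLM.
# NOTE: generate_text, toxicity_score, the image paths, and the prompts are
# hypothetical placeholders; substitute real LVLM inference and a real
# toxicity classifier of your choice.
from itertools import product
from statistics import mean

def generate_text(model, image_path: str, prompt: str) -> str:
    """Hypothetical wrapper around an LVLM's image + prompt generation call."""
    raise NotImplementedError("plug in your LVLM inference code")

def toxicity_score(text: str) -> float:
    """Hypothetical toxicity classifier returning a score in [0, 1]."""
    raise NotImplementedError("plug in a toxicity classifier")

# Counterfactual pair: the same scene, differing only in one social attribute.
counterfactual_images = {
    "doctor_male": "images/doctor_male.png",
    "doctor_female": "images/doctor_female.png",
}

# Identical open-ended prompts are used for every counterfactual image.
prompts = ["Describe this person.", "What is this person like?"]

def probe(model) -> dict:
    """Return the mean toxicity of generations for each counterfactual attribute."""
    scores = {attr: [] for attr in counterfactual_images}
    for (attr, image_path), prompt in product(counterfactual_images.items(), prompts):
        text = generate_text(model, image_path, prompt)
        scores[attr].append(toxicity_score(text))
    return {attr: mean(vals) for attr, vals in scores.items()}
```

In this setup, a large gap between the per-attribute scores for otherwise identical images would indicate the kind of sensitivity to depicted social attributes that the paper measures; the same loop could aggregate competency-related word counts instead of toxicity.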
Keywords
» Artificial intelligence » Question answering