Summary of VISREAS: Complex Visual Reasoning with Unanswerable Questions, by Syeda Nahida Akter et al.
VISREAS: Complex Visual Reasoning with Unanswerable Questions
by Syeda Nahida Akter, Sangwu Lee, Yingshan Chang, Yonatan Bisk, Eric Nyberg
First submitted to arXiv on: 23 Feb 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper proposes VISREAS, a new compositional visual question-answering dataset of 2.07M semantically diverse queries generated automatically from Visual Genome scene graphs. The task's distinguishing feature is that a model must validate whether a question is answerable with respect to the image before answering it. This motivates a new modular baseline, LOGIC2VISION, which reasons by producing and executing pseudocode, without any external modules, to generate the answer. The proposed model outperforms generative models on VISREAS (+4.82% over LLaVA-1.5; +12.23% over InstructBLIP) and achieves a significant gain in performance against classification models. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper creates a new test of how well computers can understand and answer questions about pictures. It contains about 2 million questions, some of which cannot be answered from the picture at all, and the paper tests different ways of handling them. The best method writes out step-by-step pseudocode and follows it to work out the answer. This approach works better than the other methods tested on this task. |
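
To make the idea of checking answerability before answering more concrete, here is a minimal Python sketch. It is not the authors' LOGIC2VISION model: the scene-graph format, the operation names (`select`, `relate`, `query`), and the `execute` function are all hypothetical, chosen only to illustrate running a small pseudocode program against a scene graph and returning "unanswerable" when a step cannot be grounded in the image.

```python
# Toy scene graph (hypothetical format, not the Visual Genome schema):
# object name -> its attributes and its relations to other objects.
SCENE = {
    "dog": {"attributes": {"color": "brown"}, "relations": {"on": "sofa"}},
    "sofa": {"attributes": {"color": "red"}, "relations": {}},
}


def execute(program, scene):
    """Run a list of (operation, argument) steps against a scene graph.

    Returns the answer string, or "unanswerable" as soon as a step refers
    to an object, relation, or attribute that the scene does not contain.
    """
    current = None
    for op, arg in program:
        if op == "select":
            # Pick an object by name, if it exists in the image.
            current = arg if arg in scene else None
        elif op == "relate":
            # Follow a relation from the current object to another object.
            target = scene[current]["relations"].get(arg)
            current = target if target in scene else None
        elif op == "query":
            # Read an attribute of the current object.
            current = scene[current]["attributes"].get(arg)
        else:
            raise ValueError(f"unknown operation: {op}")
        if current is None:
            return "unanswerable"
    return current


# "What color is the thing the dog is on?"
answerable = [("select", "dog"), ("relate", "on"), ("query", "color")]
# "What color is the cat?" (there is no cat in the scene)
unanswerable = [("select", "cat"), ("query", "color")]

print(execute(answerable, SCENE))
print(execute(unanswerable, SCENE))
```

In this toy setup the answerability check is a by-product of execution: a question is treated as unanswerable exactly when one of its steps fails to match the scene. How the real model represents images and pseudocode is described in the paper itself.
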
Keywords
» Artificial intelligence » Classification » Question answering