Summary of Black-box Model Ensembling for Textual and Visual Question Answering via Information Fusion, by Yuxi Xia et al.
Black-box Model Ensembling for Textual and Visual Question Answering via Information Fusion
by Yuxi Xia, Klim Zaporojets, Benjamin Roth
First submitted to arXiv on: 4 Jul 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | InfoSel is a novel ensemble method for textual and visual question answering tasks. Black-box models such as the large language model (LLM) ChatGPT and the visual question answering (VQA) model BLIP expose only their answers, so InfoSel learns to pick the best prediction among them instead of fine-tuning the models themselves. Unlike traditional ensemble methods, it does not rely on prediction probabilities or confidences, which are often unavailable for black-box models. Experimental results show an absolute increase of up to +5.19% in F1-score over standalone LLMs using only 1K training instances (a rough sketch of this selection idea follows the table). |
Low | GrooveSquid.com (original content) | This paper introduces a new way to solve question answering tasks using existing language models like ChatGPT and image models. The problem is that these models are hard to work with because they’re “black boxes” that don’t give us enough information. Our solution, called InfoSel, helps by picking the best answer from multiple models without needing their secrets. We tested it on several datasets and found that it works better than just using one model alone, especially when we only have a small amount of training data. |
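The selection idea described above can be pictured with a small sketch: each black-box model returns only a text answer, and a lightweight selector trained on a few labelled question–candidate pairs picks one of them. This is an illustrative approximation, not the authors’ InfoSel implementation; the TF-IDF features, logistic-regression selector, and toy training data below are assumptions made purely for the example.

```python
# Illustrative sketch only: a lightweight selector over black-box answers.
# The actual InfoSel model differs; everything here is an assumption.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy supervision: (question, candidate answer, is_correct) triples, where the
# candidates would come from black-box models (e.g., an LLM and a VQA model)
# that expose no prediction probabilities or confidences.
train_triples = [
    ("What color is the sky?", "blue", 1),
    ("What color is the sky?", "green", 0),
    ("How many legs does a spider have?", "eight", 1),
    ("How many legs does a spider have?", "six", 0),
]

# Represent each question-candidate pair as a TF-IDF vector and train a
# binary classifier that scores how plausible the candidate answer is.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform([q + " " + a for q, a, _ in train_triples])
y = [label for _, _, label in train_triples]
selector = LogisticRegression().fit(X, y)

def select_answer(question, candidates):
    """Score each question-candidate pair and return the top-scoring candidate."""
    feats = vectorizer.transform([question + " " + c for c in candidates])
    scores = selector.predict_proba(feats)[:, 1]
    return candidates[int(scores.argmax())]

# The candidate list stands in for answers returned by black-box models.
print(select_answer("What color is the sky?", ["green", "blue"]))
```

In the paper’s setting, the candidates would come from models such as ChatGPT and BLIP, and the selector would be trained on roughly 1K labelled instances, using only the models’ textual answers rather than any internal scores.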
Keywords
* Artificial intelligence
* F1 score
* Question answering