Summary of Inserting Faces Inside Captions: Image Captioning with Attention Guided Merging, by Yannis Tevissen (armedia-samovar) et al.
Inserting Faces inside Captions: Image Captioning with Attention Guided Merging
by Yannis Tevissen, Khalil Guetari, Marine Tassel, Erwan Kerleroux, Frédéric Petitpont
First submitted to arXiv on: 20 Mar 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Image and Video Processing (eess.IV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper introduces AstroCaptions, a new image captioning dataset built to address the difficulty of naming the people who appear in recent and archived pictures. The dataset contains thousands of public figures that traditional captioning models struggle to identify. To address this, the authors propose a post-processing method that uses explainable AI tools and vision-language models to insert the names of identified people into captions. They report significant improvements in caption quality: up to 93.2% of detected individuals are successfully inserted into captions, which improves the BLEU, ROUGE, CIDEr, and METEOR scores of each captioning model. The proposed method has the potential to reduce hallucinations and improve image accessibility. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary Imagine software that can describe pictures in words. This is called image captioning, and it helps people access and find images more easily. Today, this technology struggles with important details such as people's names. In this research, the authors created a dataset called AstroCaptions that contains thousands of public figures who are hard to recognize. They also developed a way to add the names of identified people into captions using AI tools. Their results show that this method improves caption quality and makes it easier to find what you're looking for. |
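
The summaries above describe the name-insertion step only at a high level, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes that a captioning model exposes one attention map per caption token and that a separate face recognizer has already returned a name and bounding box for each face. The function names, the list of generic person words, and the 0.5 threshold are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

# (row_start, col_start, row_end, col_end), end-exclusive, in attention-grid cells.
Box = Tuple[int, int, int, int]

# Hypothetical vocabulary of generic words that may refer to a person.
GENERIC_PERSON_WORDS = {"man", "woman", "person", "astronaut"}
ARTICLES = {"a", "an", "the"}


def attention_in_box(attn: Sequence[Sequence[float]], box: Box) -> float:
    """Return the fraction of a token's attention mass that falls inside the box."""
    r0, c0, r1, c1 = box
    total = sum(sum(row) for row in attn)
    if total == 0:
        return 0.0
    inside = sum(sum(row[c0:c1]) for row in attn[r0:r1])
    return inside / total


def insert_names(
    tokens: Sequence[str],
    attn_maps: Sequence[Sequence[Sequence[float]]],
    faces: Sequence[Tuple[str, Box]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[str]:
    """Replace generic person words with the name of the face they attend to."""
    result = list(tokens)
    used = set()
    for name, box in faces:
        best_idx, best_score = None, threshold
        for i, tok in enumerate(tokens):
            if i in used or tok.lower() not in GENERIC_PERSON_WORDS:
                continue
            score = attention_in_box(attn_maps[i], box)
            if score > best_score:
                best_idx, best_score = i, score
        if best_idx is not None:
            result[best_idx] = name
            used.add(best_idx)
    # Drop an article that directly precedes an inserted name.
    return [
        tok
        for i, tok in enumerate(result)
        if not (tok.lower() in ARTICLES and (i + 1) in used)
    ]


# Toy example on a 2x4 attention grid with placeholder names.
blank = [[0.0] * 4 for _ in range(2)]
tokens = ["an", "astronaut", "waves", "at", "a", "woman"]
attn_maps = [
    blank,
    [[0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],  # "astronaut" looks top-left
    blank,
    blank,
    blank,
    [[0.0, 0.0, 0.0, 0.2], [0.0, 0.0, 0.0, 0.8]],  # "woman" looks at right column
]
faces = [("Jane Doe", (0, 0, 1, 1)), ("John Doe", (0, 3, 2, 4))]
print(" ".join(insert_names(tokens, attn_maps, faces)))  # Jane Doe waves at John Doe
```

In this sketch, a face whose box receives too little attention from every generic person word is left out of the caption. That is one plausible way such a method could avoid inserting a name in the wrong place.
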
Keywords
» Artificial intelligence » BLEU » Image captioning » ROUGE