Summary of Explainable Image Captioning Using CNN-CNN Architecture and Hierarchical Attention, by Rishi Kesav Mohan et al.
Explainable Image Captioning using CNN-CNN architecture and Hierarchical Attention, by Rishi Kesav Mohan, Sanjay Sureshkumar,…
Pseudo-RIS: Distinctive Pseudo-supervision Generation for Referring Image Segmentation, by Seonghoon Yu, Paul Hongsuck Seo, Jeany Son. First…
RAVEN: Multitask Retrieval Augmented Vision-Language Learning, by Varun Nagaraj Rao, Siddharth Choudhary, Aditya Deshpande, Ravi Kumar…
Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?, by Mingqian Feng, Yunlong Tang,…
OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst, by Jingtao Cao, Zheng Zhang,…
From Redundancy to Relevance: Information Flow in LVLMs Across Reasoning Tasks, by Xiaofeng Zhang, Yihao Quan,…
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model, by Yebin…
Text-only Synthesis for Image Captioning, by Qing Zhou, Junlin Huang, Qiang Li, Junyu Gao, Qi Wang. First…
Class-Conditional self-reward mechanism for improved Text-to-Image models, by Safouane El Ghazouali, Arnaud Gucciardi, Umberto Michelucci. First submitted…
Towards Retrieval-Augmented Architectures for Image Captioning, by Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita…