Summary of Surgical-LLaVA: Toward Surgical Scenario Understanding via Large Language and Vision Models, by Juseong Jin et al.
Surgical-LLaVA: Toward Surgical Scenario Understanding via Large Language and Vision Models, by Juseong Jin, Chang Wook…
ChartKG: A Knowledge-Graph-Based Representation for Chart Images, by Zhiguang Zhou, Haoxuan Wang, Zhengqing Zhao, Fengling Zheng,…
Zero-shot Commonsense Reasoning over Machine Imagination, by Hyuntae Park, Yeachan Kim, Jun-Hyung Park, SangKeun Lee. First submitted…
Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering, by Ting Yu, Kunhao Fu,…
Prompting Video-Language Foundation Models with Domain-specific Fine-grained Heuristics for Video Question Answering, by Ting Yu, Kunhao…
Declarative Knowledge Distillation from Large Language Models for Visual Question Answering Datasets, by Thomas Eiter, Jan…
MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models, by Wenbo Hu, Jia-Chen Gu, Zi-Yi Dou, Mohsen Fayyaz,…
Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference, by William Thorne,…
Rewriting Conversational Utterances with Instructed Large Language Models, by Elnara Galimzhanova, Cristina Ioana Muntean, Franco Maria…
Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question…