Summary of Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models, by Yanwei Li et al.
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models, by Yanwei Li, Yuechen Zhang, Chengyao Wang,…
Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security…
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs’ Gaming Ability in Multi-Agent…
How Well Do Multi-modal LLMs Interpret CT Scans? An Auto-Evaluation Framework for Analyses, by Qingqing Zhu,…
Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning, by Deepanway Ghosal,…
Can Large Language Models do Analytical Reasoning?, by Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang,…
LLMs in Political Science: Heralding a New Era of Visual Analysis, by Yu Wang. First submitted to…
GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation, by Yi Zong, Xipeng Qiu. First submitted to…
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs, by Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen…
CyberMetric: A Benchmark Dataset based on Retrieval-Augmented Generation for Evaluating LLMs in Cybersecurity Knowledge, by Norbert…