Summary of All in an Aggregated Image for In-Image Learning, by Lei Wang et al.
All in an Aggregated Image for In-Image Learning, by Lei Wang, Wanyu Xu, Zhiqiang Hu, Yihuai…
VCD: Knowledge Base Guided Visual Commonsense Discovery in Images, by Xiangqing Shen, Yurun Song, Siwei Wu,…
Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies, by Flavio Petruzzellis, Alberto Testolin,…
Automated Floodwater Depth Estimation Using Large Multimodal Model for Rapid Flood Mapping, by Temitope Akinboyewa, Huan…
CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models, by Huijie Lv, Xiao Wang, Yuansen Zhang,…
From Text to Transformation: A Comprehensive Review of Large Language Models’ Versatility, by Pravneet Kaur, Gautam…
GAOKAO-MM: A Chinese Human-Level Benchmark for Multimodal Models Evaluation, by Yi Zong, Xipeng Qiu. First submitted to…
TV-SAM: Increasing Zero-Shot Segmentation Performance on Multimodal Medical Images Using GPT-4 Generated Descriptive Prompts Without…
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models, by Haoran…
PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain, by Liang Chen, Yichi Zhang, Shuhuai Ren,…