Summary of A Text-to-Game Engine for UGC-Based Role-Playing Games, by Lei Zhang et al.
A Text-to-Game Engine for UGC-Based Role-Playing Games by Lei Zhang, Xuezheng Peng, Shuyi Yang, Feiyang Wang First…
Fuse, Reason and Verify: Geometry Problem Solving with Parsed Clauses from Diagram by Ming-Liang Zhang, Zhong-Zhi…
CEIA: CLIP-Based Event-Image Alignment for Open-World Event-Based Understanding by Wenhao Xu, Wenming Weng, Yueyi Zhang, Zhiwei…
Infer Induced Sentiment of Comment Response to Video: A New Task, Dataset and Baseline by Qi…
VIMI: Grounding Video Generation through Multi-modal Instruction by Yuwei Fang, Willi Menapace, Aliaksandr Siarohin, Tsai-Shien Chen,…
Contrastive Learning of Preferences with a Contextual InfoNCE Loss by Timo Bertram, Johannes Fürnkranz, Martin Müller First…
BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space by Yumeng Zhang,…
TransMA: an explainable multi-modal deep learning model for predicting properties of ionizable lipid nanoparticles in…
MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning by Min Zhang,…
MMedAgent: Learning to Use Medical Tools with Multi-modal Agent by Binxu Li, Tiankai Yan, Yuanting Pan,…