Summary of MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production, by Jian Ma et al.
MS2SL: Multimodal Spoken Data-Driven Continuous Sign Language Production by Jian Ma, Wenguan Wang, Yi Yang, Feng…
NutriBench: A Dataset for Evaluating Large Language Models on Nutrition Estimation from Meal Descriptions by Andong…
Multimodal Reranking for Knowledge-Intensive Visual Question Answering by Haoyang Wen, Honglei Zhuang, Hamed Zamani, Alexander Hauptmann,…
Any Target Can be Offense: Adversarial Example Generation via Generalized Latent Infection by Youheng Sun, Shengming…
I2AM: Interpreting Image-to-Image Latent Diffusion Models via Bi-Attribution Maps by Junseo Park, Hyeryung Jang. First submitted to…
ModalChorus: Visual Probing and Alignment of Multi-modal Embeddings via Modal Fusion Map by Yilin Ye, Shishi…
Evaluating graph-based explanations for AI-based recommender systems by Simon Delarue, Astrid Bertrand, Tiphaine Viard. First submitted to…
NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models by Gengze Zhou, Yicong Hong, Zun Wang,…
HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects by Xintao Lv, Liang Xu,…
PersLLM: A Personified Training Approach for Large Language Models by Zheni Zeng, Jiayi Chen, Huimin Chen,…