Multi modal – Page 3 – GrooveSquid.com

July 13, 2025

From Noise to Nuance: Advances in Deep Generative Image Models by Benji Peng, Chia Xin Liang,…

July 13, 2025

Is Contrastive Distillation Enough for Learning Comprehensive 3D Representations? by Yifan Zhang, Junhui Hou. First submitted to…

July 13, 2025

MAGIC: Mastering Physical Adversarial Generation in Context through Collaborative LLM Agents by Yun Xing, Nhat Chung,…

July 13, 2025

ContRail: A Framework for Realistic Railway Image Synthesis using ControlNet by Andrei-Robert Alexandrescu, Razvan-Gabriel Petec, Alexandru…

July 13, 2025

TeamCraft: A Benchmark for Multi-Modal Multi-Agent Systems in Minecraft by Qian Long, Zhi Li, Ran Gong,…

July 13, 2025

BodyMetric: Evaluating the Realism of Human Bodies in Text-to-Image Generation by Nefeli Andreou, Varsha Vivek, Ying…

July 13, 2025

Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts by Chenyang Zhu,…

July 13, 2025

MIND: Effective Incorrect Assignment Detection through a Multi-Modal Structure-Enhanced Language Model by Yunhe Pang, Bo Chen,…

July 13, 2025

SocialMind: LLM-based Proactive AR Social Assistive System with Human-like Perception for In-situ Live Interactions by Bufang…

July 13, 2025

ProtDAT: A Unified Framework for Protein Sequence Design from Any Protein Text Description by Xiao-Yu Guo,…