Multi modal – Page 31 – GrooveSquid.com

July 13, 2025

LinVT: Empower Your Image-level Large Language Model to Understand Videos by Lishuai Gao, Yujie Zhong, Yingsen…

July 13, 2025

Leveraging Multimodal Protein Representations to Predict Protein Melting Temperatures by Daiheng Zhang, Yan Zeng, Xinyu Hong,…

July 13, 2025

Enhancing CLIP Conceptual Embedding through Knowledge Distillation by Kuei-Chun Kao. First submitted to arXiv on: 4 Dec…

July 13, 2025

Learning on One Mode: Addressing Multi-modality in Offline Reinforcement Learning by Mianchu Wang, Yue Jin, Giovanni…

July 13, 2025

WxC-Bench: A Novel Dataset for Weather and Climate Downstream Tasks by Rajat Shinde, Christopher E. Phillips,…

July 13, 2025

Visual Error Patterns in Multi-Modal AI: A Statistical Approach by Ching-Yi Wang. First submitted to arXiv on:…

July 13, 2025

ElectroVizQA: How well do Multi-modal LLMs perform in Electronics Visual Question Answering? by Pragati Shuddhodhan Meshram,…

July 13, 2025

LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos by Tiantian Geng, Jinrui Zhang, Qingni…

July 13, 2025

CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives by Armin Saghafian, Amirmohammad Izadi, Negin Hashemi Dijujin,…

July 13, 2025

MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks by Yiming Wu, Wei Ji, Kecheng Zheng,…