Multi modal – Page 30 – GrooveSquid.com

July 13, 2025

LLaVA-Zip: Adaptive Visual Token Compression with Intrinsic Image Information by Ke Wang, Hong Xuan. First submitted to…

July 13, 2025

GenPlan: Generative Sequence Models as Adaptive Planners by Akash Karthikeyan, Yash Vardhan Pant. First submitted to arxiv…

July 13, 2025

Seeing Syntax: Uncovering Syntactic Learning Limitations in Vision-Language Models by Sri Harsha Dumpala, David Arps, Sageev…

July 13, 2025

Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models by Quang-Hung Le, Long Hoang Dang,…

July 13, 2025

Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision? by Zihao Li, Lecheng Zheng,…

July 13, 2025

AmCLR: Unified Augmented Learning for Cross-Modal Representations by Ajay Jagannath, Aayush Upadhyay, Anant Mehta. First submitted to…

July 13, 2025

Anomaly detection using Diffusion-based methods by Aryan Bhosale, Samrat Mukherjee, Biplab Banerjee, Fabio Cuzzolin. First submitted to…

July 13, 2025

MM-PoE: Multiple Choice Reasoning via. Process of Elimination using Multi-Modal Models by Sayak Chakrabarty, Souradip Pal. First…

July 13, 2025

In-Application Defense Against Evasive Web Scans through Behavioral Analysis by Behzad Ousat, Mahshad Shariatnasab, Esteban Schafir,…

July 13, 2025

Can foundation models actively gather information in interactive environments to test hypotheses? by Nan Rosemary Ke,…