Summary of Discriminative Fine-tuning of LVLMs, by Yassine Ouali et al.
Discriminative Fine-tuning of LVLMs by Yassine Ouali, Adrian Bulat, Alexandros Xenos, Anestis Zaganidis, Ioannis Maniadis Metaxas,…
DroidCall: A Dataset for LLM-powered Android Intent Invocation by Weikai Xie, Li Zhang, Shihe Wang, Rongjie…
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation by Luca Barsellotti, Lorenzo…
Large Language Model-Brained GUI Agents: A Survey by Chaoyun Zhang, Shilin He, Jiaxu Qian, Bowen Li,…
Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents by Jun Chen, Dannong Xu, Junjie Fei,…
Neurosymbolic Graph Enrichment for Grounded World Models by Stefano De Giorgis, Aldo Gangemi, Alessandro Russo. First submitted…
Zero-shot Cross-lingual Transfer Learning with Multiple Source and Target Languages for Information Extraction: Language Selection…
HierTOD: A Task-Oriented Dialogue System Driven by Hierarchical Goals by Lingbo Mo, Shun Jiang, Akash Maharaj,…
Watson: A Cognitive Observability Framework for the Reasoning of LLM-Powered Agents by Benjamin Rombaut, Sogol Masoumzadeh,…
HumanVLM: Foundation for Human-Scene Vision-Language Model by Dawei Dai, Xu Long, Li Yutang, Zhang Yuanhui, Shuyin…