Summary of "Text-Guided Attention Is All You Need for Zero-Shot Robustness in Vision-Language Models" by Lu Yu et al.
Text-Guided Attention is All You Need for Zero-Shot Robustness in Vision-Language Models by Lu Yu, Haiyang…
Diffusion as Reasoning: Enhancing Object Goal Navigation with LLM-Biased Diffusion Model by Yiming Ji, Yang Liu,…
Building Altruistic and Moral AI Agent with Brain-inspired Affective Empathy Mechanisms by Feifei Zhao, Hui Feng,…
Going Beyond H&E and Oncology: How Do Histopathology Foundation Models Perform for Multi-stain IHC and…
ImageNet-RIB Benchmark: Large Pre-Training Datasets Don’t Always Guarantee Robustness after Fine-Tuning by Jaedong Hwang, Brian Cheung,…
Open-Vocabulary Object Detection via Language Hierarchy by Jiaxing Huang, Jingyi Zhang, Kai Jiang, Shijian Lu. First submitted…
Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs by Xingrui Zhuo, Jiapu…
AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant by Chengyou Jia, Minnan Luo,…
Unsupervised Domain Adaptation for Action Recognition via Self-Ensembling and Conditional Embedding Alignment by Indrajeet Ghosh, Garvit…
Improve Vision Language Model Chain-of-thought Reasoning by Ruohong Zhang, Bowen Zhang, Yanghao Li, Haotian Zhang, Zhiqing…