Summary of Downstream-Pretext Domain Knowledge Traceback for Active Learning, by Beichen Zhang et al.
Downstream-Pretext Domain Knowledge Traceback for Active Learning by Beichen Zhang, Liang Li, Zheng-Jun Zha, Jiebo Luo,…
LookupViT: Compressing visual information to a limited number of tokens by Rajat Koner, Gagan Jain, Prateek…
CIC-BART-SSA: Controllable Image Captioning with Structured Semantic Augmentation by Kalliopi Basioti, Mohamed A. Abdelsalam, Federico Fancellu,…
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment by Jihao Liu, Xin Huang, Jinliang Zheng,…
Low-Rank Similarity Mining for Multimodal Dataset Distillation by Yue Xu, Zhilin Lin, Yusong Qiu, Cewu Lu,…
How Culturally Aware are Vision-Language Models? by Olena Burda-Lassen, Aman Chadha, Shashank Goswami, Vinija Jain. First submitted…
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion by Zehan Wang, Ziang Zhang, Xize…
Beyond Human Vision: The Role of Large Vision Language Models in Microscope Image Analysis by Prateek…
FLoRA: Enhancing Vision-Language Models with Parameter-Efficient Federated Learning by Duy Phuong Nguyen, J. Pablo Munoz, Ali…
Bridging Vision and Language Spaces with Assignment Prediction by Jungin Park, Jiyoung Lee, Kwanghoon Sohn. First submitted…