Summary of MC-LLaVA: Multi-Concept Personalized Vision-Language Model, by Ruichuan An et al.
MC-LLaVA: Multi-Concept Personalized Vision-Language Model by Ruichuan An, Sihan Yang, Ming Lu, Kai Zeng, Yulin Luo,…
Unveiling the Hidden: Online Vectorized HD Map Construction with Clip-Level Token Interaction and Propagation by Nayeon…
A Survey on Vision Autoregressive Model by Kai Jiang, Jiaxing Huang. First submitted to arXiv on: 13…
R3HF: Reward Redistribution for Enhancing Reinforcement Learning from Human Feedback by Jiahui Li, Tai-wei Chang, Fengda…
Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG by Zilun Zhang, Haozhan Shen, Tiancheng…
TreeCoders: Trees of Transformers by Pierre Colonna D'Istria, Abdulrahman Altahhan. First submitted to arXiv on: 11 Nov…
Target-driven Attack for Large Language Models by Chong Zhang, Mingyu Jin, Dong Shu, Taowen Wang, Dongfang…
The Backpropagation of the Wave Network by Xin Zhang, Victor S. Sheng. First submitted to arXiv on:…
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis by Zanlin Ni, Yulin Wang, Renping Zhou, Yizeng…
Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis by Taihang Hu, Linxuan Li, Joost van…