Summary of SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments, by Yue Cao et al.
SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments by Yue Cao, Yun Xing,…