Summary of Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering, by Zhixuan Shen et al.
Adversarial Training with OCR Modality Perturbation for Scene-Text Visual Question Answering, by Zhixuan Shen, Haonan Luo,…
Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation, by Zicheng Zhang, Tong Zhang, Yi Zhu, Jianzhuang Liu,…
Gabor-guided transformer for single image deraining, by Sijin He, Guangfeng Lin. First submitted to arXiv on: 12…
Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline, by Xiao Wang, Ju Huang, Shiao Wang, Chuanming…
StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models, by Lezhong Wang, Jeppe Revall Frisvad, Mark…
PrimeComposer: Faster Progressively Combined Diffusion for Image Composition with Attention Steering, by Yibin Wang, Weizhong Zhang,…
Modality-Aware and Shift Mixer for Multi-modal Brain Tumor Segmentation, by Zhongzhen Huang, Linda Wei, Shaoting Zhang,…
Region-Transformer: Self-Attention Region Based Class-Agnostic Point Cloud Segmentation, by Dipesh Gyawali, Jian Zhang, BB Karki. First submitted…
GLFNET: Global-Local (frequency) Filter Networks for efficient medical image segmentation, by Athanasios Tragakis, Qianying Liu, Chaitanya…
PIDformer: Transformer Meets Control Theory, by Tam Nguyen, César A. Uribe, Tan M. Nguyen, Richard G.…