Summary of Discriminative Probing and Tuning for Text-to-Image Generation, by Leigang Qu et al.
Discriminative Probing and Tuning for Text-to-Image Generation by Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang,…
Transformer for Times Series: an Application to the S&P500 by Pierre Brugiere, Gabriel Turinici. First submitted to…
Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation by Daiqing Li, Aleks Kamko,…
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis by Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov, Ekaterina…
The Revolution of Multimodal Large Language Models: A Survey by Davide Caffagni, Federico Cocchi, Luca Barsellotti,…
Landmark-based Localization using Stereo Vision and Deep Learning in GPS-Denied Battlefield Environment by Ganesh Sapkota, Sanjay…
SDiT: Spiking Diffusion Model with Transformer by Shu Yang, Hanzhi Ma, Chengting Yu, Aili Wang, Er-Ping…
Magic-Me: Identity-Specific Video Customized Diffusion by Ze Ma, Daquan Zhou, Chun-Hsiao Yeh, Xue-She Wang, Xiuyu Li,…
Can Shape-Infused Joint Embeddings Improve Image-Conditioned 3D Diffusion? by Cristian Sbrolli, Paolo Cudrano, Matteo Matteucci. First submitted…
Spatial-Aware Latent Initialization for Controllable Image Generation by Wenqiang Sun, Teng Li, Zehong Lin, Jun Zhang. First…