Summary of Connect, Collapse, Corrupt: Learning Cross-modal Tasks with Uni-modal Data, by Yuhui Zhang et al.
Connect, Collapse, Corrupt: Learning Cross-Modal Tasks with Uni-Modal Data by Yuhui Zhang, Elaine Sui, Serena Yeung-Levy. First…
WAVES: Benchmarking the Robustness of Image Watermarks by Bang An, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal,…
MultiPLY: A Multisensory Object-Centric Embodied Large Language Model in 3D World by Yining Hong, Zishuo Zheng,…
Temporal Embeddings: Scalable Self-Supervised Temporal Representation Learning from Spatiotemporal Data for Multimodal Computer Vision by Yi…
Nahid: AI-based Algorithm for operating fully-automatic surgery by Sina Saadati. First submitted to arXiv on: 3 Nov…
MATE-Pred: Multimodal Attention-based TCR-Epitope interaction Predictor by Etienne Goffinet, Raghvendra Mall, Ankita Singh, Rahul Kaushik, Filippo…
One-Step Diffusion Distillation via Deep Equilibrium Models by Zhengyang Geng, Ashwini Pokle, J. Zico Kolter. First submitted…
SAiD: Speech-driven Blendshape Facial Animation with Diffusion by Inkyu Park, Jaewoong Cho. First submitted to arXiv on:…
An Integrated Imitation and Reinforcement Learning Methodology for Robust Agile Aircraft Control with Limited Pilot…
Concept Alignment by Sunayana Rane, Polyphony J. Bruna, Ilia Sucholutsky, Christopher Kello, Thomas L. Griffiths. First submitted…