Summary of EMMA: Efficient Visual Alignment in Multi-Modal LLMs, by Sara Ghazanfari et al.
EMMA: Efficient Visual Alignment in Multi-Modal LLMs by Sara Ghazanfari, Alexandre Araujo, Prashanth Krishnamurthy, Siddharth Garg,…
Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion by Dexuan Ding, Lei Wang, Liyun Zhu,…
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs by Hong Li, Nanxi Li,…
Transferable Unsupervised Outlier Detection Framework for Human Semantic Trajectories by Zheng Zhang, Hossein Amiri, Dazhou Yu,…
M2Distill: Multi-Modal Distillation for Lifelong Imitation Learning by Kaushik Roy, Akila Dissanayake, Brendan Tidd, Peyman Moghadam. First…
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference by Yejin Lee, Anna Sun, Basil Hosmer, Bilge…
Supervised Multi-Modal Fission Learning by Lingchao Mao, Qi Wang, Yi Su, Fleming Lure, Jing Li. First submitted…
Identifiable Shared Component Analysis of Unpaired Multimodal Mixtures by Subash Timilsina, Sagar Shrestha, Xiao Fu. First submitted…
CASPFormer: Trajectory Prediction from BEV Images with Deformable Attention by Harsh Yadav, Maximilian Schaefer, Kun Zhao,…
Uni-Med: A Unified Medical Generalist Foundation Model For Multi-Task Learning Via Connector-MoE by Xun Zhu, Ying…