Summary of Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time, by Sanjoy Chowdhury et al.
Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time, by Sanjoy Chowdhury, Sayan Nag,…
From Efficient Multimodal Models to World Models: A Survey, by Xinji Mai, Zeng Tao, Junxiong Lin,…
Curriculum Learning with Quality-Driven Data Selection, by Biao Wu, Fang Meng, Ling Chen. First submitted to arXiv…
A Teacher Is Worth A Million Instructions, by Nikhil Kothari, Ravindra Nayak, Shreyas Shetty, Amey Patil,…
Demystifying Language Model Forgetting with Low-rank Example Associations, by Xisen Jin, Xiang Ren. First submitted to arXiv…
CityGPT: Empowering Urban Spatial Cognition of Large Language Models, by Jie Feng, Yuwei Du, Tianhui Liu,…
Biomedical Visual Instruction Tuning with Clinician Preference Alignment, by Hejie Cui, Lingjun Mao, Xin Liang, Jieyu…
LiLiuM: eBay’s Large Language Models for e-commerce, by Christian Herold, Michael Kozielski, Leonid Ekimov, Pavel Petrushkov,…
MMDU: A Multi-Turn Multi-Image Dialog Understanding Benchmark and Instruction-Tuning Dataset for LVLMs, by Ziyu Liu, Tao…
Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity, by Bingxiang He, Ning Ding, Cheng…