Summary of VITA: Towards Open-Source Interactive Omni Multimodal LLM, by Chaoyou Fu et al.
VITA: Towards Open-Source Interactive Omni Multimodal LLM, by Chaoyou Fu, Haojia Lin, Zuwei Long, Yunhang Shen,…
Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models, by…
ACL Ready: RAG Based Assistant for the ACL Checklist, by Michael Galarnyk, Rutwik Routu, Kosha Bheda,…
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering, by Ziyuan Zhuang, Zhiyang Zhang, Sitao Cheng, Fangkai Yang,…
Generative Language Models with Retrieval Augmented Generation for Automated Short Answer Scoring, by Zifan Wang, Christopher…
EXAONE 3.0 7.8B Instruction Tuned Language Model, by LG AI Research, Soyoung An, Kyunghoon Bae, Eunbi…
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with…
Language Model Can Listen While Speaking, by Ziyang Ma, Yakun Song, Chenpeng Du, Jian Cong, Zhuo…
Visual Grounding for Object-Level Generalization in Reinforcement Learning, by Haobin Jiang, Zongqing Lu. First submitted to arXiv…
A Multi-Source Heterogeneous Knowledge Injected Prompt Learning Method for Legal Charge Prediction, by Jingyun Sun, Chi…