Summary of VideoOrion: Tokenizing Object Dynamics in Videos, by Yicheng Feng et al.
VideoOrion: Tokenizing Object Dynamics in Videos by Yicheng Feng, Yijiang Li, Wanpeng Zhang, Hao Luo, Zihao…
Uni-Mlip: Unified Self-supervision for Medical Vision Language Pre-training by Ameera Bawazir, Kebin Wu, Wenbin Li. First submitted…
Learning to Reason Iteratively and Parallelly for Complex Visual Reasoning Scenarios by Shantanu Jaiswal, Debaditya Roy,…
The Limited Impact of Medical Adaptation of Large Language and Vision-Language Models by Daniel P. Jeong,…
Dynamic Subset Tuning: Expanding the Operational Range of Parameter-Efficient Training for Large Language Models by Felix…
Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs by Kazuki Fujii, Taishi…
Deceiving Question-Answering Models: A Hybrid Word-Level Adversarial Approach by Jiyao Li, Mingze Ni, Yongshun Gong, Wei…
Pointwise Mutual Information as a Performance Gauge for Retrieval-Augmented Generation by Tianyu Liu, Jirui Qi, Paul…
Subgraph Retrieval Enhanced by Graph-Text Alignment for Commonsense Question Answering by Boci Peng, Yongchao Liu, Xiaohe…
Self-Training Meets Consistency: Improving LLMs’ Reasoning with Consistency-Driven Rationale Evaluation by Jaehyeok Lee, Keisuke Sakaguchi, JinYeong…