Summary of SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World, by Jiaqi Zhang et al.
SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World, by Jiaqi Zhang, Chen Gao, Liyuan Zhang,…
Mobile Video Diffusion, by Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas, Amir Ghodrati, Amirhossein Habibian. First submitted…
Multimodal Contextualized Support for Enhancing Video Retrieval System, by Quoc-Bao Nguyen-Le, Thanh-Huy Le-Nguyen. First submitted to arXiv…
Adapting to Non-Stationary Environments: Multi-Armed Bandit Enhanced Retrieval-Augmented Generation on Knowledge Graphs, by Xiaqiang Tang, Jian…
OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations, by Linke Ouyang, Yuan Qu, Hongbin Zhou,…
Piece of Table: A Divide-and-Conquer Approach for Selecting Subtables in Table Question Answering, by Wonjin Lee,…
RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models, by Greg Heinrich, Mike Ranzinger, Hongxu, Yao Lu,…
GASP: Gaussian Avatars with Synthetic Priors, by Jack Saunders, Charlie Hewitt, Yanan Jian, Marek Kowalski, Tadas…
SAT: Spatial Aptitude Training for Multimodal Language Models, by Arijit Ray, Jiafei Duan, Reuben Tan, Dina…
Learning to Correction: Explainable Feedback Generation for Visual Commonsense Reasoning Distractor, by Jiali Chen, Xusen Hei,…