Summary of AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning, by Yiwu Zhong et al.
AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning by Yiwu Zhong, Zhuoming Liu,…
SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model by Chunlin Yu, Hanqing Wang, Ye…
OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones? by Zijian Chen, Tingzhu…
Human Action CLIPS: Detecting AI-generated Human Motion by Matyas Bohacek, Hany Farid. First submitted to arXiv on:…
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis by Bo Liu,…
Creating Scalable AGI: the Open General Intelligence Framework by Daniel A. Dollinger, Michael Singleton. First submitted to…
Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark by Rong-Cheng Tu, Zi-Ao…
CCIS-Diff: A Generative Model with Stable Diffusion Prior for Controlled Colonoscopy Image Synthesis by Yifan Xie,…
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation by Yushi Lan, Shangchen Zhou, Zhaoyang Lyu,…
Contrastive Language Prompting to Ease False Positives in Medical Anomaly Detection by YeongHyeon Park, Myung Jin…