Summary of Grounding 3D Scene Affordance From Egocentric Interactions, by Cuiyu Liu et al.
Grounding 3D Scene Affordance From Egocentric Interactions by Cuiyu Liu, Wei Zhai, Yuhang Yang, Hongchen Luo,…
See then Tell: Enhancing Key Information Extraction with Vision Grounding by Shuhang Liu, Zhenrong Zhang, Pengfei…
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion by Ming Dai, Lingfeng Yang,…
LTNtorch: PyTorch Implementation of Logic Tensor Networks by Tommaso Carraro, Luciano Serafini, Fabio Aiolli. First submitted to…
MaPPER: Multimodal Prior-guided Parameter Efficient Tuning for Referring Expression Comprehension by Ting Liu, Zunnan Xu, Yue…
Multi-Document Grounded Multi-Turn Synthetic Dialog Generation by Young-Suk Lee, Chulaka Gunasekara, Danish Contractor, Ramón Fernandez Astudillo,…
Question-Answering Dense Video Events by Hangyu Qin, Junbin Xiao, Angela Yao. First submitted to arXiv on: 6…
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents by Segev Shlomov, Ben Wiesel, Aviad Sela,…
Improving Apple Object Detection with Occlusion-Enhanced Distillation by Liang Geng. First submitted to arXiv on: 3 Sep…
Unlocking the Wisdom of Large Language Models: An Introduction to The Path to Artificial General…