Summary of Enhancing Contextual Understanding in Large Language Models Through Contrastive Decoding, by Zheng Zhao et al.
Enhancing Contextual Understanding in Large Language Models through Contrastive Decoding by Zheng Zhao, Emilio Monti, Jens…
Grounding Realizable Entities by Michael Rabenberg, Carter Benson, Federico Donato, Yongqun He, Anthony Huffman, Shane Babcock,…
Grounded Knowledge-Enhanced Medical Vision-Language Pre-training for Chest X-Ray by Qiao Deng, Zhongzhen Huang, Yunqi Wang, Zhichuan…
Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization by Yongdong…
MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents by Liyan Tang, Philippe Laban, Greg Durrett. First submitted…
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments by Tianbao Xie, Danyang Zhang,…
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? by Junpeng Liu,…
Self-Explainable Affordance Learning with Embodied Caption by Zhipeng Zhang, Zhimin Wei, Guolei Sun, Peng Wang, Luc…
Your Co-Workers Matter: Evaluating Collaborative Capabilities of Language Models in Blocks World by Guande Wu, Chen…
Annolid: Annotate, Segment, and Track Anything You Need by Chen Yang, Thomas A. Cleland. First submitted to…