Summary of RandAR: Decoder-only Autoregressive Visual Generation in Random Orders, by Ziqi Pang et al.
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders by Ziqi Pang, Tianyuan Zhang, Fujun Luan, Yunze…
The use of large language models to enhance cancer clinical trial educational materials by Mingye Gao,…
Enhancing Zero-shot Chain of Thought Prompting via Uncertainty-Guided Strategy Selection by Shanu Kumar, Saish Mendke, Karody…
Hybrid Discriminative Attribute-Object Embedding Network for Compositional Zero-Shot Learning by Yang Liu, Xinshuo Wang, Jiale Du,…
Relation-Aware Meta-Learning for Zero-shot Sketch-Based Image Retrieval by Yang Liu, Jiale Du, Xinbo Gao, Jungong Han. First…
ShowUI: One Vision-Language-Action Model for GUI Visual Agent by Kevin Qinghong Lin, Linjie Li, Difei Gao,…
g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks by Zihan Wang, Gim Hee Lee. First submitted to…
Improved GUI Grounding via Iterative Narrowing by Anthony Nguyen. First submitted to arXiv on: 18 Nov 2024. Categories Main:…
Leveraging MLLM Embeddings and Attribute Smoothing for Compositional Zero-Shot Learning by Xudong Yan, Songhe Feng, Yang…
Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination by Haojie Zheng, Tianyang Xu,…