Summary of Multi-level Matching Network For Multimodal Entity Linking, by Zhiwei Hu et al.
Multi-level Matching Network for Multimodal Entity Linking by Zhiwei Hu, Víctor Gutiérrez-Basulto, Ru Li, Jeff Z.…
Steganography in Game Actions by Ching-Chun Chang, Isao Echizen. First submitted to arXiv on: 11 Dec 2024. Categories: Main:…
SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization by Zhentao Tan, Ben Xue, Jian Jia, Junhao…
Unlocking Visual Secrets: Inverting Features with Diffusion Priors for Image Reconstruction by Sai Qian Zhang, Ziyun…
Disentanglement and Compositionality of Letter Identity and Letter Position in Variational Auto-Encoder Vision Models by Bruno…
Geo-LLaVA: A Large Multi-Modal Model for Solving Geometry Math Problems with Meta In-Context Learning by Shihao…
FovealNet: Advancing AI-Driven Gaze Tracking Solutions for Optimized Foveated Rendering System Performance in Virtual Reality by…
Enriching Multimodal Sentiment Analysis through Textual Emotional Descriptions of Visual-Audio Content by Sheng Wu, Xiaobao Wang,…
Automatic Detection, Positioning and Counting of Grape Bunches Using Robots by Xumin Gao. First submitted to arXiv…
VCA: Video Curious Agent for Long Video Understanding by Zeyuan Yang, Delin Chen, Xueyang Yu, Maohao…