Summary of Segment Anything For Videos: a Systematic Survey, by Chunhui Zhang et al.
Segment Anything for Videos: A Systematic Survey, by Chunhui Zhang, Yawen Cui, Weilin Lin, Guanjie Huang,…
Solving a Rubik’s Cube Using its Local Graph Structure, by Shunyu Yao, Mitchy Lee. First submitted to…
MagicFace: Training-free Universal-Style Human Image Customized Synthesis, by Yibin Wang, Weizhong Zhang, Cheng Jin. First submitted to…
Style-Preserving Lip Sync via Audio-Aware Style Reference, by Weizhi Zhong, Jichang Li, Yinqi Cai, Liang Lin,…
EPAM-Net: An Efficient Pose-driven Attention-guided Multimodal Network for Video Action Recognition, by Ahmed Abdelkawy, Asem Ali,…
Data-Driven Pixel Control: Challenges and Prospects, by Saurabh Farkya, Zachary Alan Daniels, Aswin Raghavan, Gooitzen van…
Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics, by Ruining Li, Chuanxia…
Concept Conductor: Orchestrating Multiple Personalized Concepts in Text-to-Image Synthesis, by Zebin Yao, Fangxiang Feng, Ruifan Li,…
Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with…
VizECGNet: Visual ECG Image Network for Cardiovascular Diseases Classification with Multi-Modal Training and Knowledge Distillation, by…