Summary of Temporal Grounding Of Activities Using Multimodal Large Language Models, by Young Chol Song
Temporal Grounding of Activities using Multimodal Large Language Models, by Young Chol Song. First submitted to arxiv…
Tumor likelihood estimation on MRI prostate data by utilizing k-Space information, by M. Rempe, F. Hörst,…
Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering, by Qijun Gan, Wentong Li, Jinwei Ren, Jianke Zhu. First…
Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition, by Yuxiang…
Pruning Large Language Models to Intra-module Low-rank Architecture with Transitional Activations, by Bowen Shen, Zheng Lin,…
InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct, by Yutong Wu, Di Huang, Wenxuan Shi, Wei Wang,…
Fast and Continual Knowledge Graph Embedding via Incremental LoRA, by Jiajun Liu, Wenjun Ke, Peng Wang,…
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices, by Jianwen Jiang, Gaojie Lin, Zhengkun Rong,…
MSP-Podcast SER Challenge 2024: L’antenne du Ventoux Multimodal Self-Supervised Learning for Speech Emotion Recognition, by Jarod…
Short-term Object Interaction Anticipation with Disentangled Object Detection @ Ego4D Short Term Object Interaction Anticipation…