Summary of Cross-attention Watermarking Of Large Language Models, by Folco Bertini Baldassini et al.
Cross-Attention Watermarking of Large Language Models by Folco Bertini Baldassini, Huy H. Nguyen, Ching-Chung Chang, Isao…
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild by Zhi-Song Liu, Robin Courant,…
VIMI: Vehicle-Infrastructure Multi-view Intermediate Fusion for Camera-based 3D Object Detection by Zhe Wang, Siqi Fan, Xiaoliang…
UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation by Lunhao Duan, Shanshan Zhao, Wenjun…
An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS Diagnosis by Yingchen…
ObitoNet: Multimodal High-Resolution Point Cloud Reconstruction by Apoorv Thapliyal, Vinay Lanka, Swathi Baskaran. First submitted to arXiv…
WiFi CSI Based Temporal Activity Detection via Dual Pyramid Network by Zhendong Liu, Le Zhang, Bing…
A Full Transformer-based Framework for Automatic Pain Estimation using Videos by Stefanos Gkikas, Manolis Tsiknakis. First submitted…
Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning by Eric Brouwer,…
Efficient Scaling of Diffusion Transformers for Text-to-Image Generation by Hao Li, Shamit Lal, Zhiheng Li, Yusheng…