Summary of ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification, by Yefei He et al.
ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification, by Yefei He, Feng Chen, Jing Liu,…
DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory, by Yutong Wang, Jiali Zeng, Xuebo…
DifFRelight: Diffusion-Based Facial Performance Relighting, by Mingming He, Pascal Clausen, Ahmet Levent Taşel, Li Ma, Oliver…
Rewriting Conversational Utterances with Instructed Large Language Models, by Elnara Galimzhanova, Cristina Ioana Muntean, Franco Maria…
Towards Assurance of LLM Adversarial Robustness using Ontology-Driven Argumentation, by Tomas Bueno Momcilovic, Beat Buesser, Giulio…
Personal Intelligence System UniLM: Hybrid On-Device Small Language Model and Server-Based Large Language Model for…
On Instruction-Finetuning Neural Machine Translation Models, by Vikas Raunak, Roman Grundkiewicz, Marcin Junczys-Dowmunt. First submitted to arXiv…
Grounding is All You Need? Dual Temporal Grounding for Video Dialog, by You Qin, Wei Ji,…
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation, by Rui Zhao, Jinyu…
Beyond Correlation: Interpretable Evaluation of Machine Translation Metrics, by Stefano Perrella, Lorenzo Proietti, Pere-Lluís Huguet Cabot,…