Summary of A Review of Multi-Modal Large Language and Vision Models, by Kilian Carolan, Laura Fennelly, and Alan F. Smeaton
A Review of Multi-Modal Large Language and Vision Models by Kilian Carolan, Laura Fennelly, Alan F.…
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model by Musashi Hinck, Matthew L. Olson,…
FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection by Ziyi Zhou, Xiaoming Zhang, Litian…
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation by Rohan Chaudhury, Mihir Godbole, Aakash Garg,…
DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model by Lirui Zhao, Yue Yang,…
CHOPS: CHat with custOmer Profile Systems for Customer Service with LLMs by Jingzhe Shi, Jialuo Li,…
Fairness in Large Language Models: A Taxonomic Survey by Zhibo Chu, Zichong Wang, Wenbin Zhang. First submitted…
Object-conditioned Bag of Instances for Few-Shot Personalized Instance Recognition by Umberto Michieli, Jijoong Moon, Daehyun Kim,…
OVFoodSeg: Elevating Open-Vocabulary Food Image Segmentation via Image-Informed Textual Representation by Xiongwei Wu, Sicheng Yu, Ee-Peng…
Generation and Detection of Sign Language Deepfakes - A Linguistic and Visual Analysis by Shahzeb Naeem,…