Summary of Dia-LLaMA: Towards Large Language Model-driven CT Report Generation, by Zhixuan Chen et al.
Dia-LLaMA: Towards Large Language Model-driven CT Report Generation, by Zhixuan Chen, Luyang Luo, Yequan Bie, Hao…
An Intermediate Fusion ViT Enables Efficient Text-Image Alignment in Diffusion Models, by Zizhao Hu, Shaochong Jia,…
Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization, by Linzhi Wu, Xingyu Zhang, Yakun Zhang, Changyan…
Towards Human-Like Machine Comprehension: Few-Shot Relational Learning in Visually-Rich Documents, by Hao Wang, Tang Li, Chenhui…
MatchSeg: Towards Better Segmentation via Reference Image Matching, by Jiayu Huo, Ruiqiang Xiao, Haotian Zheng, Yang…
SensoryT5: Infusing Sensorimotor Norms into T5 for Enhanced Fine-grained Emotion Classification, by Yuhan Xia, Qingqing Zhao,…
Extending Token Computation for LLM Reasoning, by Bingli Liao, Danilo Vasconcellos Vargas. First submitted to arXiv on:…
LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models, by Yuzhang Shang, Mu Cai, Bingxin Xu,…
ChatGPT Alternative Solutions: Large Language Models Survey, by Hanieh Alipour, Nick Pendar, Kohinoor Roy. First submitted to…
M2DA: Multi-Modal Fusion Transformer Incorporating Driver Attention for Autonomous Driving, by Dongyang Xu, Haokun Li, Qingfan…