Summary of Examining Modality Incongruity in Multimodal Federated Learning For Medical Vision and Language-based Disease Detection, by Pramit Saha et al.


Examining Modality Incongruity in Multimodal Federated Learning for Medical Vision and Language-based Disease Detection

by Pramit Saha, Divyanshu Mishra, Felix Wagner, Konstantinos Kamnitsas, J. Alison Noble

First submitted to arxiv on: 7 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Multimodal Federated Learning (MMFL) leverages multiple modalities in each client to build a more robust FL model than its unimodal counterpart. However, the impact of a modality that is missing in some clients, known as modality incongruity, has been largely overlooked. This paper analyses the effects of modality incongruity and its connection with data heterogeneity across participating clients, and examines whether incongruent MMFL, with a mix of unimodal and multimodal clients, is more beneficial than unimodal FL. To handle missing modalities, three potential routes are explored: self-attention mechanisms for information fusion in MMFL, a modality imputation network (MIN) pre-trained on a multimodal client to translate between modalities on unimodal clients, and regularization techniques at the client level and server level. Experiments are conducted on two publicly available datasets, MIMIC-CXR and Open-I, containing chest X-rays and radiology reports.
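The first route above, attention-based information fusion, can be illustrated with a minimal, framework-free sketch. The function names and toy feature vectors below are hypothetical stand-ins, not the paper's implementation: tokens from both modalities (e.g. image-patch features and report-token features) attend over one another, so each fused token can draw on whichever modality is informative.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of scores
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention_fuse(tokens, d):
    # tokens: embedding vectors from BOTH modalities concatenated,
    # e.g. image-patch features followed by report-token features.
    # Each token attends over all tokens, letting one modality
    # compensate when the other is weak or missing.
    out = []
    for q in tokens:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in tokens])
        fused = [sum(w * v[i] for w, v in zip(weights, tokens))
                 for i in range(d)]
        out.append(fused)
    return out

# toy example: two "image" tokens and two "text" tokens, d = 3
image_feats = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]]
text_feats = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fused = self_attention_fuse(image_feats + text_feats, d=3)
```

Because the attention weights are non-negative and sum to one, each fused token is a convex combination of all input tokens across both modalities; in a real model the queries, keys, and values would be learned projections rather than the raw features used here.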
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about a new way to improve how machines learn from lots of different sources. It’s called Multimodal Federated Learning (MMFL). MMFL uses information from many different types of data to make better predictions. But what happens when some of this data is missing? This paper figures out the impact of missing data and shows that it can affect how well MMFL works. The study looks at three ways to solve this problem: using special attention mechanisms, creating a fake version of the missing data, and using special techniques to make sure all the data is used equally.
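The idea of "creating a fake version of the missing data" can be sketched very simply. The toy below is a hypothetical stand-in for the paper's modality imputation network: a client that has both modalities fits a simple linear map from image features to text features, and a client with images only then uses that map to fill in its missing text features.

```python
def fit_linear_map(xs, ys, lr=0.1, steps=500):
    # toy "imputation network": one-dimensional linear regression
    # y ≈ w*x + b, trained by gradient descent on paired data
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# paired (image feature, text feature) data on a multimodal client;
# the underlying relation here is t = 2*i + 1
image_feats = [0.0, 1.0, 2.0, 3.0]
text_feats = [1.0, 3.0, 5.0, 7.0]

w, b = fit_linear_map(image_feats, text_feats)

# a unimodal (image-only) client imputes a missing text feature
imputed_text = w * 4.0 + b  # ≈ 9.0
```

In the paper's setting the mapping is a learned network over high-dimensional embeddings rather than a scalar regression, but the workflow is the same: pre-train on a client where both modalities exist, then deploy on clients where one is missing.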

Keywords

* Artificial intelligence  * Attention  * Federated learning  * Regularization  * Self attention  * Translation