Summary of Federated Causal Discovery From Heterogeneous Data, by Loka Li et al.
Federated Causal Discovery from Heterogeneous Data
by Loka Li, Ignavier Ng, Gongxu Luo, Biwei Huang, Guangyi Chen, Tongliang Liu, Bin Gu, Kun Zhang
First submitted to arXiv on: 20 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper proposes a novel federated causal discovery (FCD) method that accommodates arbitrary causal models and heterogeneous data. Existing FCD methods have limitations, such as assuming identifiable functional causal models or homogeneous data distributions, which restrict their applicability in diverse scenarios. To address this, the authors introduce a surrogate variable corresponding to the client index to account for data heterogeneity across different clients. They also develop a federated conditional independence test (FCIT) for causal skeleton discovery and a federated independent change principle (FICP) for direction determination. Both FCIT and FICP construct summary statistics as a proxy for the raw data to protect privacy, and they make no assumptions about particular functional forms, which facilitates handling arbitrary causal models (see the illustrative sketch after the table). Extensive experiments on synthetic and real datasets demonstrate the efficacy of the proposed method. |
Low | GrooveSquid.com (original content) | This paper is all about finding causes in big data. Right now, we have ways to do this, but they only work if all the data is in one place. That’s not always the case! Sometimes data is spread out across different places, which makes things harder. To fix this problem, the authors created a new way to find causes in decentralized data. They used special tricks to make sure the method works even when the data is different from place to place. This is important because it means we can use the same method for lots of different types of data. The authors tested their method on synthetic and real data, and it worked really well! |
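To make the two core ideas in the medium summary concrete, here is a minimal sketch of (a) appending a surrogate client-index variable so heterogeneity across clients becomes testable, and (b) having each client share only summary statistics rather than raw rows. The sketch uses a simple partial-correlation (Fisher-z) conditional independence test on a pooled covariance matrix; the paper's actual FCIT and FICP are nonparametric and make no such linearity assumption, so treat this purely as an illustration of the federated mechanics, not the authors' method. All function names and the per-client noise setup are invented for this example.

```python
import numpy as np
from scipy import stats

def client_summary(X):
    """Each client shares only summary statistics (sample size, mean, scatter),
    never the raw rows. Illustrative choice; the paper's summary statistics differ."""
    n, _ = X.shape
    mean = X.mean(axis=0)
    centered = X - mean
    scatter = centered.T @ centered
    return n, mean, scatter

def pooled_covariance(summaries):
    """Server pools per-client summaries into a global covariance matrix."""
    n_tot = sum(n for n, _, _ in summaries)
    mean_tot = sum(n * m for n, m, _ in summaries) / n_tot
    scatter_tot = sum(
        S + n * np.outer(m - mean_tot, m - mean_tot) for n, m, S in summaries
    )
    return scatter_tot / (n_tot - 1), n_tot

def partial_corr_ci_test(cov, n, i, j, cond):
    """Fisher-z test of X_i independent of X_j given X_cond, using only the covariance."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(cond) - 3)
    return 2 * (1 - stats.norm.cdf(abs(z)))  # p-value

# Usage: each client holds heterogeneous data; a surrogate "client index" column
# is appended so cross-client heterogeneity shows up as dependence on that variable.
rng = np.random.default_rng(0)
summaries = []
for k in range(3):
    x = rng.normal(size=500)
    y = 0.8 * x + (k + 1) * rng.normal(size=500)      # noise scale varies per client
    data = np.column_stack([x, y, np.full(500, k)])   # last column = surrogate variable
    summaries.append(client_summary(data))            # only summaries leave the client

cov, n = pooled_covariance(summaries)
print(partial_corr_ci_test(cov, n, 0, 1, cond=[2]))   # test X _||_ Y | client index
```

The design point this mirrors is that the server never sees raw data, only aggregates sufficient for the test, and conditioning on the surrogate client-index variable lets a single test account for distribution shifts across clients.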