
Summary of Better Knowledge Enhancement for Privacy-Preserving Cross-Project Defect Prediction, by Yuying Wang et al.


Better Knowledge Enhancement for Privacy-Preserving Cross-Project Defect Prediction

by Yuying Wang, Yichen Li, Haozhao Wang, Lei Zhao, Xiaofang Zhang

First submitted to arXiv on: 23 Dec 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Software Engineering (cs.SE)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
FedDP is a Federated Learning (FL) approach for Cross-Project Defect Prediction (CPDP) that leverages data from multiple projects while preserving privacy. Because model training is hindered by data heterogeneity across proprietary projects, FedDP introduces two solutions: Local Heterogeneity Awareness and Global Knowledge Distillation. The distillation dataset consists of open-source project data, and the global model is optimized via knowledge distillation from a heterogeneity-aware ensemble of the local models (see the code sketch after these summaries). Experimental results on 19 projects from two datasets show that FedDP outperforms the baselines.
Low Difficulty Summary (written by GrooveSquid.com, original content)
FedDP helps predict defects in software projects without sharing sensitive data between companies. This is hard because each project's data looks different, which makes it tough to train one good model. To solve this, the researchers train models with Federated Learning and combine two ideas: first, each local model is made aware of how different its own data is; second, knowledge is distilled from the local models into a shared global model. Open-source project data serves as the common material for this distillation, so the global model can learn from many local models at once without ever seeing their private data. The results show that this method works better than existing ones.
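
To make the Global Knowledge Distillation step more concrete, here is a minimal sketch of how a heterogeneity-weighted ensemble of local models could teach a global model on an open-source distillation set. It is written in plain PyTorch and is not the paper's implementation: the toy DefectMLP model, the hetero_weights scores, the distill_loader, the temperature, and the function name distill_global_model are all hypothetical placeholders introduced for illustration.

# Illustrative sketch only, based on the summary above; all names are assumptions,
# not the paper's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DefectMLP(nn.Module):
    """Toy defect-prediction model over fixed-length feature vectors."""
    def __init__(self, n_features=20):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(), nn.Linear(32, 2))

    def forward(self, x):
        return self.net(x)

def distill_global_model(global_model, local_models, hetero_weights,
                         distill_loader, epochs=1, temperature=2.0, lr=1e-3):
    """Optimize the global model against a heterogeneity-weighted ensemble of
    local models on an unlabeled open-source distillation set."""
    opt = torch.optim.Adam(global_model.parameters(), lr=lr)
    w = torch.tensor(hetero_weights)
    w = w / w.sum()  # normalize the heterogeneity-awareness weights
    for _ in range(epochs):
        for (x,) in distill_loader:
            with torch.no_grad():
                # Weighted ensemble of local soft predictions acts as the teacher.
                teacher_probs = sum(
                    wi * F.softmax(m(x) / temperature, dim=1)
                    for wi, m in zip(w, local_models)
                )
            student_logp = F.log_softmax(global_model(x) / temperature, dim=1)
            loss = F.kl_div(student_logp, teacher_probs, reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()
    return global_model

# Usage on random stand-in data (a real run would use open-source project features).
if __name__ == "__main__":
    local_models = [DefectMLP() for _ in range(3)]   # one per participating project
    hetero_weights = [0.5, 0.3, 0.2]                 # hypothetical heterogeneity scores
    distill_set = torch.utils.data.TensorDataset(torch.randn(64, 20))
    distill_loader = torch.utils.data.DataLoader(distill_set, batch_size=16)
    global_model = distill_global_model(DefectMLP(), local_models,
                                        hetero_weights, distill_loader)

The point the sketch tries to capture is that the participating projects never share raw data: only their soft predictions on public open-source samples are combined, weighted by each project's heterogeneity score, and distilled into the global model.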

Keywords

» Artificial intelligence  » Distillation  » Federated learning  » Knowledge distillation