
Summary of Conditional Semi-supervised Data Augmentation For Spam Message Detection with Low Resource Data, by Ulin Nuha et al.


Conditional Semi-Supervised Data Augmentation for Spam Message Detection with Low Resource Data

by Ulin Nuha, Chih-Hsueh Lin

First submitted to arXiv on: 6 Jul 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Computation and Language (cs.CL)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
This version is the paper’s original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes Conditional Semi-Supervised Data Augmentation (CSSDA), an approach that improves the effectiveness and robustness of spam detection models when labeled data is scarce. The CSSDA architecture combines feature extraction with an enhanced generative network, which uses a conditional scheme to produce fake samples, in the form of latent variables, from unlabeled data. These latent variables then serve as input to the final classifier, improving the performance of the spam detection model. In the reported experiments, CSSDA outperforms several related methods and performs well even with limited labeled data.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about a new way to detect spam messages on the internet. Training machines to do this is hard when there are not enough labeled examples. The researchers propose a method called CSSDA that creates fake samples from unlabeled data to help train the model. This makes the model better at detecting spam even when it has little labeled training data. The results show that the approach works well and is more reliable than other methods.

Keywords

» Artificial intelligence  » Data augmentation  » Feature extraction  » Semi-supervised