Summary of Conditional Semi-supervised Data Augmentation For Spam Message Detection with Low Resource Data, by Ulin Nuha et al.
Conditional Semi-Supervised Data Augmentation for Spam Message Detection with Low Resource Data
by Ulin Nuha, Chih-Hsueh Lin
First submitted to arXiv on: 6 Jul 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper proposes Conditional Semi-Supervised Data Augmentation (CSSDA), an approach for improving the effectiveness and robustness of spam detection models when labeled data is scarce. The CSSDA architecture combines feature extraction with an enhanced generative network that produces latent variables as fake samples from unlabeled data through a conditional scheme. These latent variables are then used as input to a final classifier, which improves the performance of the spam detection model. In the reported experiments, CSSDA outperforms several related methods and achieves strong results even with limited labeled data. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper is about a new way to detect spam messages on the internet. Training machines to do this is hard when there are not enough labeled examples. The researchers propose a method called CSSDA that generates fake samples from unlabeled data to help train the model. This makes the model better at detecting spam even when it has little labeled training data. The results show that this approach performs better and is more reliable than the other methods it was compared with. |
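
To make the pipeline described in the summaries more concrete, here is a minimal Python sketch of the general idea: extract features, use the labeled data to attach a class condition to each unlabeled message, generate fake samples from the unlabeled data, and train a final classifier on the enlarged set. This is not the authors' CSSDA model. It rests on loud assumptions: word counts stand in for the paper's feature extraction, random Gaussian jitter stands in for the learned generative network, and a nearest-centroid rule stands in for the final classifier. Every name in it (`extract_features`, `augment`, `train_classifier`, the toy vocabulary and messages) is hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def extract_features(message, vocab):
    """Toy feature extraction (assumption): word counts over a fixed vocabulary."""
    counts = Counter(message.lower().split())
    return [float(counts[word]) for word in vocab]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(vector, centroids):
    """Return the label whose centroid is closest to the vector."""
    return min(centroids, key=lambda label: squared_distance(vector, centroids[label]))


def augment(labeled, unlabeled, vocab, noise=0.1, seed=0):
    """Build a training set of (features, label) pairs from both data sources.

    Each unlabeled message gets a class condition from the labeled centroids,
    then a fake sample is produced by jittering its features. The jitter is a
    stand-in for a trained generator, not the paper's network.
    """
    rng = random.Random(seed)
    by_label = {}
    for text, label in labeled:
        by_label.setdefault(label, []).append(extract_features(text, vocab))
    centroids = {label: centroid(vecs) for label, vecs in by_label.items()}

    samples = [(vec, label) for label, vecs in by_label.items() for vec in vecs]
    for text in unlabeled:
        features = extract_features(text, vocab)
        condition = nearest_label(features, centroids)
        fake = [x + rng.gauss(0.0, noise) for x in features]
        samples.append((fake, condition))
    return samples


def train_classifier(samples):
    """Final classifier (assumption): one centroid per class."""
    by_label = {}
    for vector, label in samples:
        by_label.setdefault(label, []).append(vector)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


# Toy data, invented for this example only.
vocab = ["free", "win", "prize", "meeting", "lunch", "tomorrow"]
labeled = [("win a free prize", "spam"), ("lunch meeting tomorrow", "ham")]
unlabeled = ["free prize free", "meeting tomorrow", "win win prize"]

training_set = augment(labeled, unlabeled, vocab)
model = train_classifier(training_set)
prediction = nearest_label(extract_features("free prize", vocab), model)
print(len(training_set), prediction)
```

With only two labeled messages, the three unlabeled ones more than double the training set. The sketch keeps the three stages separate so that any of them could be swapped for a stronger component, such as a learned encoder or a trained conditional generator.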
Keywords
» Artificial intelligence » Data augmentation » Feature extraction » Semi-supervised