Summary of GANsemble for Small and Imbalanced Data Sets: A Baseline for Synthetic Microplastics Data, by Daniel Platnick et al.


GANsemble for Small and Imbalanced Data Sets: A Baseline for Synthetic Microplastics Data

by Daniel Platnick, Sourena Khanzadeh, Alireza Sadeghian, Richard Anthony Valenzano

First submitted to arXiv on: 10 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at a different level of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary
Written by: Paper authors

The paper’s original abstract.

Medium Difficulty Summary
Written by: GrooveSquid.com (original content)

GANsemble is a deep learning framework for generating class-conditioned synthetic data from small and imbalanced data sets. The motivating problem is microplastics research, where limited and imbalanced data make it hard to understand the potential harms of ingesting or inhaling microplastic particles. The framework combines data augmentation with conditional generative adversarial networks (cGANs) and has two modules: a data chooser module that automates the selection of the best data augmentation strategy, and a cGAN module that uses this strategy to train a cGAN for generating enhanced synthetic data. The authors evaluate GANsemble experimentally on a small and imbalanced microplastics data set, introduce a Microplastic-cGAN (MPcGAN) algorithm, and establish baselines in terms of Fréchet Inception Distance (FID) and Inception Score (IS). They also present a synthetic microplastics filter (SYMP-Filter) algorithm to increase the quality of the generated synthetic microplastics data.

Low Difficulty Summary
Written by: GrooveSquid.com (original content)

Microplastic particles are tiny pieces of plastic that humans can ingest or inhale. Scientists need more data to understand the potential harm, but current machine learning methods struggle when data is limited and imbalanced. This paper proposes a new way to generate synthetic data using deep learning techniques. The method combines two steps: selecting the best data augmentation strategy and then generating synthetic data. The researchers tested the method on a small microplastics data set and introduced a new algorithm called MPcGAN. They also set baselines for evaluating the quality of the generated synthetic data.
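
The medium difficulty summary reports baselines in terms of Inception Score (IS). As a rough illustration of what that metric measures, the sketch below computes IS from per-sample class probabilities using the standard definition, IS = exp(mean over x of KL(p(y|x) || p(y))). It is not the paper's evaluation code. The function name `inception_score` and the hand-written probability vectors are assumptions made for illustration; in practice the probabilities would come from a pretrained Inception classifier applied to generated images.

```python
import math
from typing import Sequence


def inception_score(probs: Sequence[Sequence[float]]) -> float:
    """Inception Score from per-sample class probabilities p(y|x).

    IS = exp(mean over samples of KL(p(y|x) || p(y))), where p(y) is the
    average of p(y|x) over all samples. Illustrative helper, not the
    paper's implementation.
    """
    if not probs:
        raise ValueError("need at least one probability vector")
    n = len(probs)
    k = len(probs[0])

    # Marginal class distribution p(y): average the per-sample predictions.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]

    # Sum the KL divergence between each p(y|x) and the marginal p(y).
    total_kl = 0.0
    for p in probs:
        for j in range(k):
            if p[j] > 0.0:  # the 0 * log(0) term is treated as 0
                total_kl += p[j] * math.log(p[j] / marginal[j])

    return math.exp(total_kl / n)


# Hypothetical toy inputs: confident predictions spread evenly over 3 classes.
confident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical toy inputs: every sample gets the same uniform prediction.
uncertain = [[1 / 3, 1 / 3, 1 / 3]] * 4

print(inception_score(confident))
print(inception_score(uncertain))
```

With these toy inputs, the confident and evenly spread predictions give a score close to 3 (the number of classes), while the uniform predictions give a score close to 1. Higher scores indicate samples that are both individually recognizable and diverse across classes.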

Keywords

  • Artificial intelligence
  • Data augmentation
  • Deep learning
  • Machine learning
  • Synthetic data