Summary of Learning from Convolution-based Unlearnable Datasets, by Dohyun Kim et al.
Learning from Convolution-based Unlearnable Datasets
by Dohyun Kim, Pedro Sandoval-Segura
First submitted to arXiv on: 4 Nov 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | High Difficulty Summary: Read the original abstract here |
| Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The Convolution-based Unlearnable DAtaset (CUDA) method aims to protect large datasets used in deep learning from unauthorized use. By applying class-wise blurs to every image, CUDA makes data unlearnable for neural networks, which learn relations between blur kernels and labels instead of informative features. The authors evaluate whether CUDA data remains unlearnable after sharpening and frequency filtering, finding that this combination improves the utility of CUDA data for training. They demonstrate a substantial increase in test accuracy over adversarial training on the CIFAR-10, CIFAR-100, and ImageNet-100 datasets. This work highlights the need for ongoing refinement of data poisoning techniques to ensure data privacy. |
| Low | GrooveSquid.com (original content) | Low Difficulty Summary: CUDA is a method that makes large datasets used in deep learning unlearnable by applying class-wise blurs to every image. This helps protect the data from being used without permission. The authors tested whether CUDA data really stays unlearnable, and found that simple changes to the images, like sharpening or adjusting their frequency content, can make the data useful for training again. Models trained this way scored much higher on tests than models trained with adversarial training. This is important because it shows that simple techniques can undo this kind of protection, so data poisoning methods still need improvement. |
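As an illustration of the ideas in the summaries above (not the authors' actual implementation), the sketch below shows in NumPy how a class-wise blur of the kind CUDA applies might work, along with the two countermeasures the paper evaluates: sharpening (here, a simple unsharp mask) and frequency filtering (here, a Fourier-domain low-pass). All function names, kernel sizes, and parameters are illustrative assumptions.

```python
import numpy as np

def conv2d(img, kernel):
    """Same-size 2D convolution using edge padding (illustrative, not optimized)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

def class_blur_kernel(label, size=3):
    """Hypothetical per-class kernel: seeding the RNG with the class label means
    every image of one class gets the same blur, so a network can shortcut-learn
    the kernel-label relation instead of real image features (the CUDA idea)."""
    rng = np.random.default_rng(label)
    k = rng.random((size, size))
    return k / k.sum()  # normalize so overall brightness is preserved

def sharpen(img, amount=1.0):
    """Unsharp masking: img + amount * (img - box_blur(img))."""
    box = np.ones((3, 3)) / 9.0
    return img + amount * (img - conv2d(img, box))

def low_pass_filter(img, keep_fraction=0.5):
    """Frequency filtering: zero out high frequencies in the Fourier domain."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    ch, cw = h // 2, w // 2
    rh, rw = int(ch * keep_fraction), int(cw * keep_fraction)
    mask = np.zeros_like(f)
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = 1
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
```

A "protected" image would be produced as `conv2d(img, class_blur_kernel(label))`; the paper's finding, in these terms, is that passing such images through operations like `sharpen` and `low_pass_filter` before training restores enough useful signal for a model to learn from them.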
Keywords
- Artificial intelligence
- Deep learning