
Summary of GCI-ViTAL: Gradual Confidence Improvement with Vision Transformers for Active Learning on Label Noise, by Moseli Mots’oehli and Kyungim Baek


GCI-ViTAL: Gradual Confidence Improvement with Vision Transformers for Active Learning on Label Noise

by Moseli Mots’oehli and Kyungim Baek

First submitted to arXiv on: 8 Nov 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper explores active learning (AL) for image classification, focusing on minimizing labeling costs while achieving high accuracy. The study compares AL methods on several datasets, including CIFAR10, CIFAR100, Food101, and Chest X-ray, under different label noise rates. It also investigates the impact of model architecture by comparing Convolutional Neural Networks (CNNs) and Vision Transformer (ViT)-based models. A novel deep active learning algorithm, GCI-ViTAL, is proposed to be robust to label noise, using prediction entropy and attention vectors to select informative data points while flagging potentially mislabeled candidates. Label smoothing is applied so the trained model does not become overly confident in potentially noisy labels. The paper evaluates GCI-ViTAL under varying levels of symmetric label noise and compares it to five other AL strategies, demonstrating significant performance improvements for ViTs over CNNs in noisy-label settings.

Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper is about a way to make computers learn new things without needing as many examples. This is useful because labeling data can be time-consuming and expensive. The study looks at different ways to do this on images, using various types of models, like Convolutional Neural Networks (CNNs) and Vision Transformer (ViT)-based models. A new algorithm called GCI-ViTAL is proposed that can handle mistakes in the labels. This helps the computer learn better even when some examples are wrong. The results show that using certain types of models and algorithms can lead to better performance, especially when there are errors in the labels.

Keywords

» Artificial intelligence  » Active learning  » Attention  » Image classification  » Vision transformer  » ViT