
Summary of A Preliminary Study on Continual Learning in Computer Vision Using Kolmogorov-Arnold Networks, by Alessandro Cacciatore et al.


A preliminary study on continual learning in computer vision using Kolmogorov-Arnold Networks

by Alessandro Cacciatore, Valerio Morelli, Federica Paganica, Emanuele Frontoni, Lucia Migliorelli, Daniele Berardini

First submitted to arXiv on: 20 Sep 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
Deep learning has long been dominated by multi-layer perceptrons (MLPs), which have demonstrated superiority over other optimizable models in various domains. Recently, a new alternative to MLPs has emerged: Kolmogorov-Arnold Networks (KANs), which are based on a fundamentally different mathematical framework. The authors of KAN claim that these networks address several major issues in MLPs, such as catastrophic forgetting in continual learning scenarios. However, this claim has so far been supported only by results from a regression task on a toy 1D dataset. In this paper, we extend the investigation by evaluating the performance of KANs on continual learning tasks within computer vision, specifically using the MNIST datasets. We compare the behavior of MLPs and two KAN-based models in a class-incremental learning scenario, ensuring that all architectures involved have the same number of trainable parameters. Our results demonstrate that an efficient version of KAN outperforms both traditional MLPs and the original KAN implementation. We also analyze the influence of hyperparameters in MLPs and KANs, as well as the impact of certain trainable parameters in KANs, such as bias and scale weights. Additionally, we provide a preliminary investigation of recent KAN-based convolutional networks and compare their performance with that of traditional convolutional neural networks.
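As a concrete illustration of the class-incremental setup this summary describes, here is a minimal sketch (not the authors' code; the model, task split, and hyperparameters are assumptions): MNIST's ten digit classes are divided into sequential tasks, a small MLP is trained on each task in turn, and accuracy is then measured on every task seen so far, so a drop on earlier tasks exposes catastrophic forgetting. In the paper's comparison, a KAN with a matched number of trainable parameters would be evaluated under the same loop.

```python
# Minimal class-incremental learning sketch on MNIST (illustrative only;
# the model, task split, and hyperparameters below are assumptions, not
# the paper's exact setup).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

def make_task_loaders(n_tasks=5, batch_size=64):
    """Split MNIST's 10 classes into n_tasks disjoint class groups."""
    tfm = transforms.ToTensor()
    train = datasets.MNIST("data", train=True, download=True, transform=tfm)
    test = datasets.MNIST("data", train=False, download=True, transform=tfm)
    per_task = 10 // n_tasks
    loaders = []
    for t in range(n_tasks):
        classes = set(range(t * per_task, (t + 1) * per_task))
        tr = [i for i, y in enumerate(train.targets.tolist()) if y in classes]
        te = [i for i, y in enumerate(test.targets.tolist()) if y in classes]
        loaders.append((DataLoader(Subset(train, tr), batch_size, shuffle=True),
                        DataLoader(Subset(test, te), batch_size)))
    return loaders

# An MLP stand-in; the paper would swap in KAN variants with a matched
# trainable-parameter count (the KAN definition itself is omitted here).
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

task_loaders = make_task_loaders()
for t, (train_loader, _) in enumerate(task_loaders):
    for x, y in train_loader:  # one epoch per task, for brevity
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    # After finishing task t, evaluate every task seen so far; accuracy
    # decay on earlier tasks is the catastrophic forgetting being studied.
    for s, (_, test_loader) in enumerate(task_loaders[: t + 1]):
        correct = total = 0
        with torch.no_grad():
            for x, y in test_loader:
                correct += (model(x).argmax(1) == y).sum().item()
                total += y.numel()
        print(f"after task {t}: accuracy on task {s} = {correct / total:.3f}")
```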
Low Difficulty Summary (original content by GrooveSquid.com)
This paper investigates a new type of deep learning model called the Kolmogorov-Arnold Network (KAN). KAN is an alternative to the popular multi-layer perceptron (MLP), and its creators claim it solves some major issues with MLPs, such as forgetting old information when learning new things. So far that claim had only been tested on a very simple task, so the authors of this paper check how KAN performs on a harder one: recognizing handwritten digits. They compare KANs with traditional MLPs and with another, more efficient type of KAN. The results show that the efficient version of KAN is better than the others.

Keywords

» Artificial intelligence  » Continual learning  » Deep learning  » Regression