


Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers

by Johann Schmidt, Sebastian Stober

First submitted to arXiv on: 6 May 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computer Vision and Pattern Recognition (cs.CV)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper’s original abstract serves as the high difficulty summary.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Deep neural networks are widely used in daily life, but they still struggle to robustly handle spatially transformed input signals. Current approaches to mitigate this issue either increase sample variability (data augmentation) or explicitly constrain models with pre-defined inductive biases. Both have limitations: data augmentation is restricted by the size of the data space, making sufficient coverage intractable, while developing inductive biases for every possible scenario requires significant engineering effort. In contrast, the authors’ novel technique emulates human behavior by modifying percepts through mental or physical actions during inference. They propose a model-agnostic algorithm called Inverse Transformation Search (ITS) that traverses a sparsified inverse transformation tree during inference using parallel energy-based evaluations. ITS equips models with zero-shot pseudo-invariance to spatially transformed inputs. Their evaluation on benchmark datasets, including a synthesised ImageNet test set, shows that ITS outperforms the baselines in all zero-shot test scenarios.
Low Difficulty Summary (written by GrooveSquid.com, original content)
Neural networks are used in many areas of life, but they have a big problem: they struggle when the things they look at are moved, rotated, or otherwise changed. Some people try to fix this by adding lots of different examples (data augmentation), but that only works if you have enough data. Others try to change the model’s thinking ahead of time (inductive biases), but that takes a lot of work. Instead, the authors’ new idea is like how humans think: we modify what we see based on what we do or think. They created an algorithm called Inverse Transformation Search (ITS) that helps models understand things even if they’re moved around. They tested it and it works better than other methods.
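To make the core idea concrete, here is a minimal sketch of an energy-based inverse-transformation search. Everything here is an illustrative assumption, not the paper’s actual method: the transform set is reduced to 90-degree rotations (the paper traverses a sparsified inverse transformation tree over richer transforms), the function names are invented, and the log-sum-exp energy is just one plausible confidence score. Each candidate "un-transformed" input is scored by the classifier, and the lowest-energy candidate wins.

```python
import numpy as np

def rotate90(x, k):
    # Rotate an H x W x C image by k * 90 degrees counterclockwise.
    # A stand-in for a richer set of candidate inverse transformations.
    return np.rot90(x, k=k, axes=(0, 1))

def energy(logits):
    # Hypothetical energy score: negative log-sum-exp of the logits,
    # computed stably. Lower energy = the classifier is more "confident"
    # about this candidate orientation.
    m = logits.max()
    return -(m + np.log(np.exp(logits - m).sum()))

def inverse_transformation_search(classify, x, num_rotations=4):
    """Try each candidate inverse transform, keep the lowest-energy one.

    `classify` maps an image to a logit vector; the candidates could be
    evaluated in parallel, as in the paper's parallel evaluations.
    """
    candidates = [rotate90(x, k) for k in range(num_rotations)]
    logits = [classify(c) for c in candidates]
    energies = [energy(l) for l in logits]
    best = int(np.argmin(energies))
    return candidates[best], logits[best]
```

In use, a classifier that was only trained on upright images can be handed a tilted input: the search rotates it back through the candidate orientations and returns the one the classifier is most confident about, giving a form of zero-shot pseudo-invariance without retraining.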

Keywords

» Artificial intelligence  » Data augmentation  » Inference  » Zero shot