
Summary of “Language-Enhanced Latent Representations for Out-of-Distribution Detection in Autonomous Driving,” by Zhenjiang Mao et al.


Language-Enhanced Latent Representations for Out-of-Distribution Detection in Autonomous Driving

by Zhenjiang Mao, Dong-You Jhong, Ao Wang, Ivan Ruchkin

First submitted to arXiv on: 2 May 2024

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Machine Learning (cs.LG); Robotics (cs.RO)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper’s original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes a novel approach to out-of-distribution (OOD) detection in autonomous driving that leverages large foundation models such as CLIP. Using the cosine similarity between CLIP’s image and text representations as a new latent representation, the authors aim to improve the transparency and controllability of latent encodings for visual anomaly detection. The proposed method is compared against traditional pre-trained encoders, which lack this capacity for human interaction. Experiments on realistic driving data show that the language-based latent representation outperforms traditional vision-encoder representations and further improves detection performance when combined with them.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Imagine you’re in a self-driving car, and it needs to figure out whether what’s happening is normal or weird. This paper shows how big AI models can help the car do this better. These models look at both images (like the road ahead) and text (like short descriptions of driving scenes) and compare them to see if anything seems off. That helps the car detect when something unexpected happens, like an unfamiliar road sign or sudden construction. Tests show that this approach works well and can even make the car’s existing detection better!

Keywords

  • Artificial intelligence
  • Anomaly detection
  • Cosine similarity
  • Encoder