Language-Enhanced Latent Representations for Out-of-Distribution Detection in Autonomous Driving
by Zhenjiang Mao, Dong-You Jhong, Ao Wang, Ivan Ruchkin
First submitted to arXiv on: 2 May 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Machine Learning (cs.LG); Robotics (cs.RO)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper proposes a novel approach to out-of-distribution (OOD) detection in autonomous driving, leveraging large foundation models such as CLIP. By using the cosine similarity of image and text representations encoded by CLIP as a new representation, the authors aim to improve the transparency and controllability of latent encodings for visual anomaly detection. The proposed method is compared to traditional pre-trained encoders that lack human interaction capabilities. Experimental results on realistic driving data show that the language-based latent representation outperforms traditional vision encoder representations and further improves detection performance when combined with them. |
Low | GrooveSquid.com (original content) | Imagine you’re in a self-driving car, and it needs to figure out whether what’s happening is normal or strange. This paper shows how big AI models can help the car do this better. These models look at both images (like roads) and text (like descriptions of driving scenes), then compare the two to see if anything seems off. This helps detect when something unexpected happens, like an unusual road sign or construction. The tests show that this approach works well and can make the car’s anomaly detection even better when combined with existing methods. |
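The core idea described above — representing an image by its cosine similarities to a set of text-prompt embeddings, then flagging images that match no in-distribution prompt — can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the embeddings would come from CLIP's image and text encoders in practice, and the prompt set and threshold here are hypothetical.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def language_representation(image_emb: np.ndarray,
                            text_embs: list[np.ndarray]) -> np.ndarray:
    # The language-enhanced latent representation: one similarity score
    # per text prompt describing an in-distribution driving scene.
    return np.array([cosine_similarity(image_emb, t) for t in text_embs])

def is_ood(image_emb: np.ndarray,
           text_embs: list[np.ndarray],
           threshold: float = 0.2) -> bool:
    # Flag the image as out-of-distribution when it is dissimilar to
    # every in-distribution prompt (threshold value is illustrative).
    rep = language_representation(image_emb, text_embs)
    return bool(rep.max() < threshold)
```

Because each dimension of the representation corresponds to a human-readable prompt, the scores can be inspected directly, which is the transparency benefit the paper highlights over opaque vision-encoder latents.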
Keywords
» Artificial intelligence » Anomaly detection » Cosine similarity » Encoder