Summary of Near-Infrared and Low-Rank Adaptation of Vision Transformers in Remote Sensing, by Irem Ulku et al.
Near-Infrared and Low-Rank Adaptation of Vision Transformers in Remote Sensing
by Irem Ulku, O. Ozgur Tanriover, Erdem Akagündüz
First submitted to arXiv on: 28 May 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | A novel approach is proposed to address the challenge of training deep neural networks on high-resolution Near-Infrared (NIR) reflectance images, which are crucial for dynamic plant health monitoring. Existing methods rely on pre-training large networks in the RGB domain and fine-tuning them for infrared images, introducing a domain shift issue due to differences between RGB and NIR visual traits. To overcome this limitation, the authors suggest using vision transformer (ViT) backbones pre-trained in the RGB domain, with low-rank adaptation (LoRA) for downstream tasks in the NIR domain. Extensive experiments demonstrate that employing LoRA with pre-trained ViT backbones yields the best performance for downstream tasks applied to NIR images. |
| Low | GrooveSquid.com (original content) | This study explores a new way to train AI models on special kinds of photos that help us monitor plant health. Right now, it’s hard to get and label these pictures, so people use big networks trained on regular color photos and then adjust them for the special infrared pictures. This can cause problems because the two types of pictures look very different. To solve this problem, researchers suggest using a special kind of AI model called Vision Transformer (ViT) that was trained on regular color photos, but then adjusting it in a way that’s more efficient and effective for the special infrared pictures. |
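To make the low-rank adaptation idea concrete, here is a minimal numpy sketch of the standard LoRA update: a frozen pre-trained weight matrix `W` (standing in for, say, a ViT attention projection trained on RGB data) is augmented with a trainable low-rank product `B @ A`. The names `W`, `A`, `B`, `r`, and `alpha` follow the common LoRA convention and are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 8, 8, 2            # adapter rank r << min(d_out, d_in)
alpha = 4.0                         # LoRA scaling factor

W = rng.normal(size=(d_out, d_in))      # frozen RGB-pretrained weight (not updated)
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                # zero-init, so the adapter starts as a no-op

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A; the dense update
    # is never materialized, only the two small factors are trained.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(1, d_in))
# With B zero-initialized, the adapted output equals the frozen model's output.
assert np.allclose(lora_forward(x), x @ W.T)
```

The appeal for the NIR setting summarized above is the parameter count: only `r * (d_in + d_out)` adapter weights are trained per adapted matrix instead of `d_in * d_out`, which helps when labeled NIR data is scarce.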
Keywords
» Artificial intelligence » Fine-tuning » LoRA » Low-rank adaptation » Vision transformer » ViT