Summary of SynCDR : Training Cross Domain Retrieval Models with Synthetic Data, by Samarth Mishra et al.


SynCDR : Training Cross Domain Retrieval Models with Synthetic Data

by Samarth Mishra, Carlos D. Castillo, Hongcheng Wang, Kate Saenko, Venkatesh Saligrama

First submitted to arXiv on: 31 Dec 2023

Categories

  • Main: Computer Vision and Pattern Recognition (cs.CV)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
This version is the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper addresses cross-domain retrieval, where a model must identify images from the same semantic category across two visual domains. For example, given a sketch of an object, the model retrieves a real-world image of it from an online store's catalog, or vice versa. Traditional methods learn a feature space in which Euclidean distances reflect similarity, and they perform well even without human annotations. The authors consider a harder setting in which the two domains may not share any categories in the training data. To tackle this, they generate synthetic data by translating images from one domain to the other. They compare translation models trained specifically for this purpose with large-scale pre-trained text-to-image diffusion models driven by prompts. The latter yields better results, leading to more accurate cross-domain retrieval models. The best model achieves up to a 15% improvement over prior art.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Cross-domain retrieval is a way for machines to find matching images across different types of pictures. For example, if you show a machine a sketch of a cat, it should be able to find a real-life picture of a cat online. The challenge arises when the two kinds of images have no categories in common in their training data. The authors propose a simple solution: creating fake images that fill in the missing categories across domains. Generating this synthetic data with large-scale text-to-image models and prompts works better than other methods. The best approach can improve results by up to 15%.
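
The medium difficulty summary mentions a feature space in which Euclidean distances reflect similarity. The retrieval step this implies can be sketched roughly as follows, assuming every image has already been turned into a feature vector by some trained encoder. The function name `retrieve`, the toy 2-D vectors, and the labels are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import math

def retrieve(query, gallery, k=1):
    """Return the labels of the k gallery items closest to the query.

    query   -- feature vector of the query image (e.g. a sketch)
    gallery -- list of (label, feature_vector) pairs from the other domain
    """
    # Rank gallery items by Euclidean distance to the query feature.
    ranked = sorted(gallery, key=lambda item: math.dist(query, item[1]))
    return [label for label, _ in ranked[:k]]

# Toy 2-D features standing in for the output of a trained encoder (assumed).
gallery = [
    ("cat", [0.9, 0.1]),
    ("dog", [0.1, 0.9]),
    ("car", [-0.8, -0.5]),
]
sketch_feature = [0.8, 0.2]  # hypothetical embedding of a cat sketch

print(retrieve(sketch_feature, gallery, k=2))
```

In this toy setup the sketch feature lies closest to the "cat" entry, so that label is ranked first. In practice, how useful the ranking is depends entirely on how well the learned feature space aligns the two domains, which is the problem the synthetic data is meant to help with.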

Keywords

» Artificial intelligence  » Synthetic data  » Translation