Summary of LiNR: Model-Based Neural Retrieval on GPUs at LinkedIn, by Fedor Borisyuk et al.
LiNR: Model Based Neural Retrieval on GPUs at LinkedIn
by Fedor Borisyuk, Qingquan Song, Mingzhou Zhou, Ganesh Parameswaran, Madhu Arun, Siva Popuri, Tugrul Bingol, Zhuotao Pei, Kuang-Hsuan Lee, Lu Zheng, Qizhan Shao, Ali Naqvi, Sen Zhou, Aman Gupta
First submitted to arXiv on: 18 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper introduces LiNR, LinkedIn’s large-scale GPU-based retrieval system designed for billion-item indexes. The authors discuss their experiences and challenges in building scalable, differentiable search indexes with TensorFlow and PyTorch at production scale. LiNR packages both the items and the model weights into a single model binary, treating index construction as a form of model training. The paper describes strategies for scaling the system to large indexes, including full scans and efficient filtering. A key focus is attribute-based pre-filtering for exhaustive GPU searches, which addresses a common weakness of KNN search where post-filtering reduces result quality. The authors also present multi-embedding retrieval algorithms and strategies for tackling cold-start issues, and discuss quantization advances that support larger indexes. LiNR contributed a 3% relative increase in professional daily active users on LinkedIn Feed through out-of-network post recommendations. The authors position it as one of the industry’s first live-updated model-based retrieval indexes, with the potential to simplify complex serving infrastructure and enable end-to-end optimization. |
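The paper does not include its implementation, but the pre-filtering idea the summary describes, masking out disallowed items before an exhaustive (full-scan) similarity search rather than discarding results afterwards, can be sketched in a few lines of NumPy. Function and variable names here are illustrative, not from the paper; a production GPU system would run the same dense operations in a framework such as PyTorch or TensorFlow:

```python
import numpy as np

def filtered_knn(queries, index, attrs, allowed, k=3):
    """Exhaustive top-k retrieval with attribute pre-filtering.

    A boolean mask derived from item attributes is applied to the score
    matrix before top-k selection, so filtered-out items can never be
    returned -- unlike post-filtering, which can leave fewer than k
    valid results after an approximate search.
    """
    scores = queries @ index.T                  # (num_queries, num_items)
    mask = np.isin(attrs, allowed)              # items passing the filter
    scores = np.where(mask, scores, -np.inf)    # exclude filtered items
    return np.argsort(-scores, axis=1)[:, :k]   # top-k item indices per query

# Toy usage: 1,000 item embeddings, each tagged with a categorical attribute.
rng = np.random.default_rng(0)
index = rng.normal(size=(1000, 16)).astype(np.float32)  # item embeddings
attrs = rng.integers(0, 4, size=1000)                   # e.g. item category
query = rng.normal(size=(1, 16)).astype(np.float32)
hits = filtered_knn(query, index, attrs, allowed=[1, 2])
```

Every returned index is guaranteed to satisfy the attribute constraint, which is the property the paper highlights as hard to preserve in conventional approximate KNN pipelines.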
Low | GrooveSquid.com (original content) | This paper talks about how to make searching through a huge amount of data faster and more efficient. The authors built a system called LiNR that can quickly handle billions of pieces of information. They describe the challenges they faced while building it and how they overcame them. One key idea is to help the computer find what you’re looking for by applying special filters before it searches through all the data, which makes searching both faster and more accurate. The authors also share new techniques they discovered for improving search, which have helped make LinkedIn a better place for professionals. |
Keywords
» Artificial intelligence » Embedding » Optimization » Quantization