Summary of LiNR: Model-Based Neural Retrieval on GPUs at LinkedIn, by Fedor Borisyuk et al.
LiNR: Model Based Neural Retrieval on GPUs at LinkedIn
by Fedor Borisyuk, Qingquan Song, Mingzhou Zhou, Ganesh Parameswaran, Madhu Arun, Siva Popuri, Tugrul Bingol, Zhuotao Pei, Kuang-Hsuan Lee, Lu Zheng, Qizhan Shao, Ali Naqvi, Sen Zhou, Aman Gupta
First submitted to arXiv on: 18 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper introduces LiNR, LinkedIn’s large-scale GPU-based retrieval system designed for billion-item indexes. The authors discuss their experiences and challenges in building scalable, differentiable search indexes with TensorFlow and PyTorch at production scale. LiNR packages both the items and the model weights into a single model binary, treating index construction as a form of model training. The paper describes strategies for scaling the system to large indexes, including full scans and efficient filtering. A key focus is attribute-based pre-filtering for exhaustive GPU searches, which addresses a common weakness of KNN search where post-filtering reduces result quality. The authors also present multi-embedding retrieval algorithms and strategies for tackling cold-start issues, and discuss quantization advances that support larger indexes. LiNR contributed a 3% relative increase in professional daily active users on LinkedIn Feed through out-of-network post recommendations. The authors position it as one of the industry’s first live-updated model-based retrieval indexes, with the potential to simplify complex serving infrastructure and enable end-to-end optimization. |
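The paper does not include its implementation, but the pre-filtering idea the summary describes, masking out disallowed items before an exhaustive (full-scan) similarity search rather than discarding results afterwards, can be sketched in a few lines of NumPy. Function and variable names here are illustrative, not from the paper; a production GPU system would run the same dense operations in a framework such as PyTorch or TensorFlow:

```python
import numpy as np

def filtered_knn(queries, index, attrs, allowed, k=3):
    """Exhaustive top-k retrieval with attribute pre-filtering.

    A boolean mask derived from item attributes is applied to the score
    matrix before top-k selection, so filtered-out items can never be
    returned -- unlike post-filtering, which can leave fewer than k
    valid results after an approximate search.
    """
    scores = queries @ index.T                  # (num_queries, num_items)
    mask = np.isin(attrs, allowed)              # items passing the filter
    scores = np.where(mask, scores, -np.inf)    # exclude filtered items
    return np.argsort(-scores, axis=1)[:, :k]   # top-k item indices per query

# Toy usage: 1,000 item embeddings, each tagged with a categorical attribute.
rng = np.random.default_rng(0)
index = rng.normal(size=(1000, 16)).astype(np.float32)  # item embeddings
attrs = rng.integers(0, 4, size=1000)                   # e.g. item category
query = rng.normal(size=(1, 16)).astype(np.float32)
hits = filtered_knn(query, index, attrs, allowed=[1, 2])
```

Every returned index is guaranteed to satisfy the attribute constraint, which is the property the paper highlights as hard to preserve in conventional approximate KNN pipelines.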
Low | GrooveSquid.com (original content) | This paper talks about how to make searching through a huge amount of data faster and more efficient. The authors built a system called LiNR that can quickly handle billions of pieces of information. They describe the challenges they faced while building it and how they overcame them. One key idea is to help the computer find what you’re looking for by applying special filters before it searches through all the data, which makes searching both faster and more accurate. The authors also share new techniques they discovered for improving search, which have helped make LinkedIn a better place for professionals. |
Keywords
» Artificial intelligence » Embedding » Optimization » Quantization