Summary of ELASTIC: Efficient Linear Attention for Sequential Interest Compression, by Jiaxin Deng et al.


ELASTIC: Efficient Linear Attention for Sequential Interest Compression

by Jiaxin Deng, Shiyao Wang, Song Lu, Yinfeng Li, Xinchen Luo, Yuanjun Liu, Peixing Xu, Guorui Zhou

First submitted to arXiv on: 18 Aug 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Information Retrieval (cs.IR)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, available on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
The paper proposes ELASTIC (Efficient Linear Attention for SequenTial Interest Compression), which addresses the scalability limits of sequential recommendation models that rely on the transformer's attention mechanism. The authors introduce a linear dispatcher attention mechanism that reduces the quadratic complexity of self-attention to linear complexity, enabling modeling of extremely long sequences up to 90% more efficiently with a 2.7x inference speedup. To retain the capacity to model diverse user interests, ELASTIC initializes a learnable interest memory bank and sparsely retrieves compressed user interests from it with negligible computational overhead. Experiments on public datasets show that ELASTIC outperforms baselines by a significant margin while remaining efficient on long sequences. A code sketch of these two ideas follows the summaries below.

Low Difficulty Summary (original content by GrooveSquid.com)
ELASTIC is a new way to help computers recommend things you might like. Right now, many recommenders use "transformers" to do this, but they get slower and use more memory when dealing with very long lists of items. The team behind ELASTIC found a way to make this faster and more efficient by using a special kind of attention that doesn't take up as much space or time. They also came up with a new way to store and retrieve information about what you like, so the system can better understand your tastes and suggest things you'll enjoy.
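
To make the summaries above concrete, here is a minimal PyTorch-style sketch of the two ideas the paper describes: a dispatcher-based linear attention that routes a long sequence through a small set of learnable dispatcher tokens, and a learnable interest memory bank with sparse top-k retrieval. This is a sketch under assumptions, not the authors' implementation: the module names, the number of dispatcher tokens and heads, the bank size, and the top-k weighting are all illustrative choices inferred from the summary.

```python
# Illustrative sketch of (1) dispatcher-style linear attention and
# (2) sparse retrieval from a learnable interest memory bank.
# Names and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearDispatcherAttention(nn.Module):
    """Routes a length-L sequence through k << L dispatcher tokens,
    so cost grows as O(L*k) instead of the O(L^2) of full self-attention."""
    def __init__(self, d_model: int, num_dispatchers: int = 16, num_heads: int = 4):
        super().__init__()
        self.dispatchers = nn.Parameter(torch.randn(num_dispatchers, d_model))
        self.gather = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.scatter = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, L, D)
        B = x.size(0)
        disp = self.dispatchers.unsqueeze(0).expand(B, -1, -1)  # (B, k, D)
        # Dispatchers summarize the sequence: cost O(k*L).
        summary, _ = self.gather(disp, x, x)                    # (B, k, D)
        # Each position reads the summaries back: cost O(L*k).
        out, _ = self.scatter(x, summary, summary)              # (B, L, D)
        return out

class InterestMemoryBank(nn.Module):
    """Learnable bank of M interest embeddings; each user sparsely
    retrieves the top-k entries most similar to their representation."""
    def __init__(self, d_model: int, bank_size: int = 1024, top_k: int = 8):
        super().__init__()
        self.bank = nn.Parameter(torch.randn(bank_size, d_model))
        self.top_k = top_k

    def forward(self, user_repr: torch.Tensor) -> torch.Tensor:  # (B, D)
        scores = user_repr @ self.bank.t()                 # (B, M) similarities
        top_scores, idx = scores.topk(self.top_k, dim=-1)  # keep only top-k
        weights = F.softmax(top_scores, dim=-1)            # renormalize over top-k
        retrieved = self.bank[idx]                         # (B, k, D)
        return (weights.unsqueeze(-1) * retrieved).sum(1)  # (B, D)

if __name__ == "__main__":
    x = torch.randn(2, 500, 64)           # 2 users, 500 interactions each
    seq = LinearDispatcherAttention(64)(x)            # (2, 500, 64)
    interests = InterestMemoryBank(64)(seq.mean(1))   # pooled repr -> (2, 64)
    print(seq.shape, interests.shape)
```

Routing through k dispatcher tokens keeps the attention cost linear in the sequence length L, which is what lets very long behavior sequences be modeled efficiently, while the memory bank only ever touches its top-k entries per user, keeping the retrieval overhead small.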

Keywords

» Artificial intelligence  » Attention  » Inference