Summary of ELASTIC: Efficient Linear Attention for Sequential Interest Compression, by Jiaxin Deng et al.
ELASTIC: Efficient Linear Attention for Sequential Interest Compression
by Jiaxin Deng, Shiyao Wang, Song Lu, Yinfeng Li, Xinchen Luo, Yuanjun Liu, Peixing Xu, Guorui Zhou
First submitted to arXiv on: 18 Aug 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Information Retrieval (cs.IR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper’s original abstract, available on the arXiv listing. |
Medium | GrooveSquid.com (original content) | The paper proposes ELASTIC (Efficient Linear Attention for SequenTial Interest Compression), which addresses the scalability limits of sequential recommendation models built on the transformer’s attention mechanism. The authors introduce a linear dispatcher attention mechanism that replaces quadratic self-attention, enabling modeling of extremely long behavior sequences while reducing GPU memory usage by up to 90% and delivering a 2.7× inference speedup. To retain the capacity to model diverse user interests, ELASTIC initializes a learnable interest memory bank and sparsely retrieves compressed user interests from it with negligible computational overhead. Experiments on public datasets show that ELASTIC outperforms baselines by a significant margin while remaining efficient on long sequences. A minimal code sketch of both ideas appears after this table. |
Low | GrooveSquid.com (original content) | ELASTIC is a new way to help recommend things you might like. Right now, some systems use “transformers” to do this, but they get slower and use more memory when dealing with very long lists of things. The team behind ELASTIC found a way to make this faster and more efficient by using a special kind of attention that doesn’t take up as much space or time. They also came up with a new way to store and retrieve information about what you like, so the system can better understand your tastes and suggest things you’ll enjoy. |
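The two mechanisms described in the medium summary, a linear dispatcher attention that compresses a long sequence into a fixed number of tokens, and a learnable interest memory bank with sparse retrieval, can be illustrated in code. The sketch below is a minimal PyTorch illustration, not the authors’ implementation: the module names, the number of dispatcher tokens (`num_dispatchers`), the memory size (`num_slots`), and the retrieval width (`top_t`) are all assumptions chosen for demonstration.

```python
# Minimal sketch of the two ideas above, NOT the ELASTIC authors' code.
# All hyperparameters and module names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearDispatcherAttention(nn.Module):
    """Compress a length-L sequence into k dispatcher tokens, then broadcast
    back: two cross-attentions, each O(L*k) instead of O(L^2)."""
    def __init__(self, d_model: int, num_dispatchers: int = 16):
        super().__init__()
        self.dispatchers = nn.Parameter(torch.randn(num_dispatchers, d_model))
        self.compress = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.broadcast = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (B, L, d). Dispatchers attend over the sequence -> (B, k, d).
        b = seq.size(0)
        disp = self.dispatchers.unsqueeze(0).expand(b, -1, -1)
        compressed, _ = self.compress(disp, seq, seq)
        # Each sequence position attends over the k compressed tokens -> (B, L, d).
        out, _ = self.broadcast(seq, compressed, compressed)
        return out

class SparseInterestMemory(nn.Module):
    """Learnable interest memory bank with sparse top-t retrieval, so only a
    few memory slots are touched per user (negligible extra compute)."""
    def __init__(self, d_model: int, num_slots: int = 1024, top_t: int = 8):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_slots, d_model))
        self.top_t = top_t

    def forward(self, user_state: torch.Tensor) -> torch.Tensor:
        # user_state: (B, d). Score every slot, keep only the top-t.
        scores = user_state @ self.memory.t()                # (B, M)
        top_vals, top_idx = scores.topk(self.top_t, dim=-1)  # (B, t)
        weights = F.softmax(top_vals, dim=-1)                 # sparse mixture
        retrieved = self.memory[top_idx]                      # (B, t, d)
        return (weights.unsqueeze(-1) * retrieved).sum(dim=1)  # (B, d)
```

The point of the sketch: each cross-attention costs O(L·k) rather than O(L²) because the sequence only ever attends to (or is attended by) k fixed dispatcher tokens, and the memory bank reads just `top_t` of its slots per user, which is why retrieval overhead can stay negligible even as the bank grows.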
Keywords
» Artificial intelligence » Attention » Inference