
Summary of Cottention: Linear Transformers with Cosine Attention, by Gabriel Mongaras, Trevor Dohm, and Eric C. Larson


Cottention: Linear Transformers With Cosine Attention

by Gabriel Mongaras, Trevor Dohm, Eric C. Larson

First submitted to arXiv on: 27 Sep 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by: paper authors)
See the original abstract on arXiv.

Medium Difficulty Summary (written by: GrooveSquid.com, original content)
In this paper, the authors introduce Cottention, a novel attention mechanism that replaces the softmax operation with cosine similarity. By rearranging the attention equation, Cottention achieves native linear memory complexity with respect to sequence length, making it more memory-efficient than softmax attention. The authors also show that Cottention can be reformulated as a recurrent neural network (RNN) with a finite hidden state, allowing for constant memory usage during inference. They evaluate it on both bidirectional BERT and causal GPT tasks, reporting performance comparable to softmax attention with significantly lower memory requirements.
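
The rearrangement described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the inputs are random, and it leaves out any scaling or stabilization terms the paper may apply. It only shows why dropping softmax allows the product to be reassociated so that no sequence-by-sequence score matrix is built, and how the causal case can be run as a recurrence with a fixed-size state.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def cosine_attention_quadratic(q, k, v):
    """Reference form: builds the full (n x n) cosine-similarity matrix."""
    scores = l2_normalize(q) @ l2_normalize(k).T   # shape (n, n)
    return scores @ v

def cosine_attention_linear(q, k, v):
    """Same result, reassociated so no (n x n) matrix is ever built."""
    kv = l2_normalize(k).T @ v                     # shape (d, d_v)
    return l2_normalize(q) @ kv

def cosine_attention_causal_recurrent(q, k, v):
    """Causal variant as a recurrence with a fixed-size (d x d_v) state."""
    qn, kn = l2_normalize(q), l2_normalize(k)
    state = np.zeros((q.shape[-1], v.shape[-1]))
    out = np.empty((q.shape[0], v.shape[-1]))
    for t in range(q.shape[0]):
        state += np.outer(kn[t], v[t])             # fold token t into the state
        out[t] = qn[t] @ state                     # uses tokens 0..t only
    return out

# Hypothetical toy inputs: 6 tokens, 4 dimensions per head.
rng = np.random.default_rng(0)
n, d = 6, 4
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
```

In the linear form the intermediate `kv` has a size that depends only on the head dimensions, not on the sequence length, and in the recurrent form `state` plays the role of the finite hidden state mentioned in the summary.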

Low Difficulty Summary (written by: GrooveSquid.com, original content)
Cottention is a new way to help computers process long pieces of text without running out of memory. It is a form of attention, the technique that lets a machine focus on the most relevant parts of a sentence. Because it needs less memory, it can make big texts such as articles or books easier to process. The authors tested Cottention and found that it performs about as well as the standard approach while using less memory.

Keywords

» Artificial intelligence  » Attention  » BERT  » Cosine similarity  » GPT  » Inference  » Neural network  » RNN  » Softmax