
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings

by Yichen Jiang, Xiang Zhou, Mohit Bansal

First submitted to arXiv on: 9 Feb 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes SQ-Transformer (Structurally Quantized Transformer), an extension of the original Transformer designed to improve compositional generalization. It introduces two novel components: Structure-oriented Vector Quantization (SoVQ) and the Systematic Attention Layer (SAL). SoVQ clusters word embeddings into structurally equivalent entities, while SAL induces invariant or similar attention patterns for sentences with the same syntactic structure. Unlike vanilla Transformers, which overfit on low-complexity datasets, SQ-Transformer demonstrates improved compositional generalization on multiple semantic parsing and machine translation benchmarks. The authors show that SoVQ indeed learns a syntactically clustered embedding space and that SAL induces generalizable attention patterns, together leading to better systematicity.
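To make the two components more concrete, here is a minimal, hypothetical sketch in PyTorch of the ideas as the summary describes them: a codebook-based quantizer that snaps word embeddings to a small set of class vectors, and an attention layer whose scores are computed from the quantized embeddings while the values keep the original word embeddings. The codebook size, dimensions, the query/key-versus-value split, and all function names are illustrative assumptions, not the authors' implementation; SoVQ's structure-oriented training objective is omitted entirely.

```python
# Hypothetical sketch only: sizes, the q/k-vs-v split, and the omission of
# SoVQ's structure-oriented training objective are all assumptions.
import torch
import torch.nn.functional as F

d_model, vocab_size, n_codes = 64, 100, 8   # illustrative sizes
emb = torch.randn(vocab_size, d_model)      # word embedding table
codebook = torch.randn(n_codes, d_model)    # one vector per structural class

def quantize(x):
    """Snap each embedding to its nearest codebook vector, so words in the
    same structural class share one class-level embedding. (A real VQ layer
    would also need a straight-through estimator to train the codebook.)"""
    codes = torch.cdist(x, codebook).argmin(dim=-1)
    return codebook[codes]

def systematic_attention(token_ids):
    """Single-head attention whose pattern depends only on the quantized,
    class-level embeddings: two sentences whose words map to the same class
    sequence therefore receive the same attention pattern."""
    x = emb[token_ids]                      # (seq_len, d_model) word embeddings
    xq = quantize(x)                        # class-level embeddings
    scores = xq @ xq.T / d_model ** 0.5     # scores from quantized vectors only
    attn = F.softmax(scores, dim=-1)
    return attn @ x                         # values keep word-level embeddings

out = systematic_attention(torch.tensor([3, 17, 42, 7]))
print(out.shape)  # torch.Size([4, 64])
```

The point the sketch tries to capture is the dependency structure: because the attention pattern is a function of structural classes rather than individual words, sentences that share a syntactic skeleton are processed alike, which is the systematicity property the paper targets.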
Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper introduces a new way for computers to understand complex sentences by teaching them about the structure of language. The model is called SQ-Transformer. It is better than other models at handling new sentences that share a familiar structure. The researchers found that teaching the model about sentence structure made it much better at understanding sentences that use different words but the same structure. They tested the model on several tasks and showed that it works well.

Keywords

  • Artificial intelligence
  • Attention
  • Embedding space
  • Generalization
  • Quantization
  • Semantic parsing
  • Transformer
  • Translation