
Summary of “When Representations Align: Universality in Representation Learning Dynamics” by Loek van Rossem et al.


When Representations Align: Universality in Representation Learning Dynamics

by Loek van Rossem, Andrew M. Saxe

First submitted to arxiv on: 14 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Neurons and Cognition (q-bio.NC)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Deep neural networks come in many sizes and architectures. The combination of architecture, dataset, and learning algorithm is commonly understood to shape the representations a network learns. Even so, recent research has found that different architectures can learn qualitatively similar representations. This paper develops an effective theory of representation learning under the assumption that the encoding map (from input to hidden representation) and the decoding map (from hidden representation to output) are smooth functions. The theory describes learning dynamics in large, complex networks whose hidden representations aren’t strongly constrained by the parametrization. Experiments show that it captures aspects of representation learning across a range of deep networks with different activation functions and architectures, and that it exhibits phenomena akin to the “rich” and “lazy” regimes. While many network behaviors depend on architecture, this research suggests that certain behaviors are preserved once models become sufficiently flexible.

Low Difficulty Summary (written by GrooveSquid.com, original content)
Deep neural networks come in all shapes and sizes! Recent studies have shown that different architectures can learn similar patterns. This paper builds a theory to explain how these patterns form. It assumes that the way data is encoded and decoded is smooth, which fits large, complex networks whose hidden patterns aren’t strongly tied to the architecture. Experiments show that the theory holds up across many types of deep networks with different activation functions and structures. This research helps us understand why certain behaviors appear across many different neural networks.

Keywords

  • Artificial intelligence
  • Representation learning