Summary of Measuring Sharpness in Grokking, by Jack Miller et al.


Measuring Sharpness in Grokking

by Jack Miller, Patrick Gleeson, Charles O’Neill, Thang Bui, Noam Levi

First submitted to arXiv on: 14 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
This workshop paper introduces a robust technique for measuring neural network grokking, a phenomenon in which networks reach perfect or near-perfect performance on a validation set long after they already perform well on the training set. The authors use this method to investigate transitions in training and validation accuracy in two settings: a theoretical framework developed by Levi et al. (2023) and a two-layer MLP trained to predict the parity of bit strings, with grokking induced by the concealment strategy of Miller et al. (2023). They find that the trends relating the relative grokking gap to sharpness are similar in both settings, whether sharpness is measured in absolute or relative terms. (A rough sketch of how such a gap might be computed appears after the summaries.)

Low Difficulty Summary (original content by GrooveSquid.com)
Neural networks sometimes become good at a task only long after they have already memorized their training examples. This is called "grokking". Researchers want to understand how this happens. The authors came up with a way to measure grokking and used it to study two different kinds of setups: one built from simple mathematical formulas, and one where a small network was trained to tell whether a string of bits contained an even or odd number of ones. Surprisingly, both setups showed similar patterns as the networks "grokked" their tasks. This helps us understand what makes neural networks suddenly get so good at some problems.

Keywords

  • Artificial intelligence
  • Neural network