
Post-Hoc Reversal: Are We Selecting Models Prematurely?

by Rishabh Ranjan, Saurabh Garg, Mrigank Raman, Carlos Guestrin, Zachary Lipton

First submitted to arXiv on: 11 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (stat.ML)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same paper at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
The paper presents a comprehensive empirical study of post-hoc transforms, which are applied to trained models to improve performance, robustness, and uncertainty estimation. The authors identify a phenomenon called post-hoc reversal, in which performance trends between models are reversed after post-hoc transforms are applied, particularly in high-noise settings. Post-hoc reversal is observed with transforms including ensembling and stochastic weight averaging (SWA), which suggests that model development decisions such as early stopping, checkpointing, and hyperparameter choices should account for the transforms applied afterwards. To this end, the authors propose a simple technique called post-hoc selection, which uses post-hoc metrics to inform these decisions. This approach yields significant improvements, including a >1.5x MMLU improvement over naive selection on an LLM instruction tuning dataset.
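
To make the idea concrete, here is a minimal sketch of post-hoc selection applied to an early-stopping decision. The loss curves and epoch counts are illustrative assumptions, not numbers from the paper; they simply depict a case where the base model's validation loss and the ensemble's validation loss disagree about the best stopping point.

```python
import numpy as np

# Illustrative sketch of post-hoc selection (all numbers are made up for
# demonstration; they are not results from the paper). Under post-hoc
# reversal, the epoch that is best for the raw base model need not be
# best for the post-hoc transformed model (here, an ensemble).

# Hypothetical per-epoch validation losses (lower is better).
base_val_loss = np.array([0.90, 0.70, 0.62, 0.60, 0.65, 0.73, 0.82, 0.95])
ensemble_val_loss = np.array([0.85, 0.66, 0.57, 0.52, 0.49, 0.47, 0.46, 0.45])

# Naive selection: early-stop on the base model's own metric.
naive_epoch = int(np.argmin(base_val_loss))

# Post-hoc selection: apply the transform first, then pick the epoch
# whose post-transform metric is best.
posthoc_epoch = int(np.argmin(ensemble_val_loss))

print(f"naive selection stops at epoch {naive_epoch}")       # epoch 3
print(f"post-hoc selection stops at epoch {posthoc_epoch}")  # epoch 7
```

In a setting exhibiting post-hoc reversal, the naive rule would stop training at epoch 3, while post-hoc selection keeps training because the ensembled model is still improving.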

Low Difficulty Summary (original content by GrooveSquid.com)
The paper shows that special techniques applied after a model is trained can flip which models look best: a model that seemed worse before the technique can come out ahead after it. The authors call this “post-hoc reversal” and find that it happens often when there is a lot of noise in the data. For example, ensembling (combining the predictions of multiple models) and SWA (averaging a model’s weights from different points in training) both tend to favor base models trained for longer. This means you might need to rethink how you train and select your models, making decisions based on how well they perform after these techniques are applied rather than before.
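
As a rough illustration of the SWA operation mentioned above, the toy PyTorch loop below keeps a running average of a model’s weights over the later part of training using torch.optim.swa_utils.AveragedModel. The model, data, and schedule are stand-ins chosen for brevity, not the paper’s experimental setup.

```python
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel

# Toy example of stochastic weight averaging (SWA): maintain a running
# average of the model's weights across later training epochs.
# Everything here (model, data, epoch counts) is a stand-in for brevity.

torch.manual_seed(0)
model = nn.Linear(4, 2)
swa_model = AveragedModel(model)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(64, 4), torch.randn(64, 2)

for epoch in range(20):
    loss = nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch >= 10:  # start averaging partway through training
        swa_model.update_parameters(model)

# Per the summary's advice, decisions like "how long to train" should be
# judged on the averaged model's validation metrics, not the raw model's.
print(nn.functional.mse_loss(swa_model(x), y).item())
```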

Keywords

» Artificial intelligence  » Early stopping  » Hyperparameter  » Instruction tuning