MixMax: Distributional Robustness in Function Space via Optimal Data Mixtures

by Anvith Thudi, Chris J. Maddison

First submitted to arXiv on: 3 Jun 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Machine Learning (stat.ML)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper addresses limitations of current group distributionally robust optimization (group DRO) methods by reparameterizing group DRO from parameter space to function space. The authors show that group DRO over the space of bounded functions admits a minimax theorem, and that the minimax-optimal mixture distribution can be found by solving a simple convex optimization problem. The resulting method, called MixMax, allows for efficient optimization even with non-convex losses and non-parametric model classes. Experiments on several datasets, including ACSIncome and CelebA, show MixMax outperforming standard group DRO baselines. A toy numerical sketch of the mixture-finding step follows the summaries.

Low Difficulty Summary (original content by GrooveSquid.com)
Machine learning models need to work well across different groups or settings. To measure this, researchers use something called “worst-case performance”: how well the model does on the group where it performs worst. Unfortunately, current methods struggle when the model is not simple or when the problem is hard. This paper makes a clever move by looking at the problem in a new way: instead of working with the model’s parameters, the authors look at the functions that the model can learn. This helps them solve some big problems. They show that their approach, called MixMax, works well on many datasets and even beats the current best methods.
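
The mixture-finding step in the medium summary can be made concrete with a small numerical sketch. What follows is a minimal reading of the abstract, not the authors’ implementation: it assumes finite input and label spaces, cross-entropy loss, a shared marginal over x, and known per-group conditionals p_k(y|x); the toy sizes, random distributions, and projected-gradient solver are all illustrative choices. Under these assumptions, the Bayes predictor of a mixture is optimal for that mixture over all bounded functions, so the minimax-optimal mixture maximizes the mixture’s Bayes risk (here, a conditional entropy), which is concave in the mixture weights.

```python
# A toy sketch of the mixture-finding step as read from the abstract, NOT the
# authors' implementation. Assumptions (all illustrative): finite X and Y,
# cross-entropy loss, a shared marginal over x, and known per-group
# conditionals p_k(y|x). The worst-case mixture then maximizes the mixture's
# Bayes risk H_lam(Y|X), which is concave in the mixture weights lam.
import numpy as np

rng = np.random.default_rng(0)
K, X, Y = 3, 4, 2                                      # groups, |X|, |Y| (toy sizes)
p_y_given_x = rng.dirichlet(np.ones(Y), size=(K, X))   # p_k(y|x), shape (K, X, Y)
p_x = np.full(X, 1.0 / X)                              # shared marginal over x

def risk_and_grad(lam):
    """Bayes risk of the lam-mixture under cross-entropy (= H_lam(Y|X))
    and its gradient in lam; concave, so gradient ascent finds the max."""
    p_mix = np.einsum("k,kxy->xy", lam, p_y_given_x)   # p_lam(y|x)
    logp = np.log(p_mix + 1e-12)
    risk = -(p_x[:, None] * p_mix * logp).sum()
    grad = -np.einsum("x,kxy,xy->k", p_x, p_y_given_x, logp + 1.0)
    return risk, grad

def project_simplex(v):
    """Euclidean projection onto the probability simplex (Duchi et al., 2008)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ind = np.arange(1, len(v) + 1)
    rho = ind[u - css / ind > 0][-1]
    return np.maximum(v - css[rho - 1] / rho, 0.0)

lam = np.full(K, 1.0 / K)                              # start from the uniform mixture
for t in range(300):                                   # projected gradient ascent
    _, grad = risk_and_grad(lam)
    lam = project_simplex(lam + 0.5 / np.sqrt(t + 1) * grad)

print("minimax-optimal mixture weights:", np.round(lam, 3))
```

The returned weights lie on the probability simplex; under these assumptions, fitting a model to data resampled according to these weights would give the function-space minimax candidate.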

Keywords

  • Artificial intelligence
  • Machine learning
  • Optimization