Summary of Robustness to Subpopulation Shift with Domain Label Noise Via Regularized Annotation Of Domains, by Nathan Stromberg and Rohan Ayyagari and Monica Welfert and Sanmi Koyejo and Richard Nock and Lalitha Sankar
Robustness to Subpopulation Shift with Domain Label Noise via Regularized Annotation of Domains
by Nathan Stromberg, Rohan Ayyagari, Monica Welfert, Sanmi Koyejo, Richard Nock, Lalitha Sankar
First submitted to arXiv on: 16 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary: Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: Existing last-layer retraining methods that optimize worst-group accuracy (WGA) rely heavily on well-annotated groups in the training data, and the paper shows that these approaches are susceptible to noise in the domain annotations. To address this, the authors propose Regularized Annotation of Domains (RAD), which trains a robust last-layer classifier without relying on explicit domain annotations. RAD is shown to be competitive with other annotation-free techniques and to outperform state-of-the-art annotation-reliant methods even when only a small fraction of the training annotations is noisy. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary: This research introduces a new way to help machine learning models work well across different types of data without needing a label that says which type each example belongs to. Current methods depend on these labels, which are time-consuming and expensive to create and often contain mistakes. The authors show that their method, called Regularized Annotation of Domains (RAD), works well even when the training data contains errors or noise. This matters because real-world datasets often have mistakes or inconsistencies. RAD outperforms previous label-dependent methods even when only a small share of those labels is wrong. |
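
The summaries above refer to worst-group accuracy (WGA) without defining it. The sketch below is an illustration, not code from the paper: it assumes the common setup in which a group is a (class label, domain) pair and WGA is the lowest accuracy over all non-empty groups. The function name `worst_group_accuracy` and the toy data are hypothetical.

```python
import numpy as np


def worst_group_accuracy(y_true, y_pred, domain):
    """Return (WGA, per-group accuracies).

    Assumption: a "group" is a (class label, domain) pair, and WGA is the
    minimum accuracy across all groups that contain at least one example.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    domain = np.asarray(domain)
    correct = y_true == y_pred

    per_group = {}
    for y in np.unique(y_true):
        for d in np.unique(domain):
            mask = (y_true == y) & (domain == d)
            if mask.any():  # skip empty groups
                per_group[(int(y), int(d))] = float(correct[mask].mean())
    return min(per_group.values()), per_group


# Toy data: the group (label=0, domain=1) is a minority with one example.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
domain = [0, 0, 0, 1, 0, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]  # the single minority example is misclassified

wga, per_group = worst_group_accuracy(y_true, y_pred, domain)
overall = float(np.mean(np.array(y_true) == np.array(y_pred)))

print(f"overall accuracy: {overall:.3f}")
print(f"worst-group accuracy: {wga:.3f}")
```

On this toy data the overall accuracy is 0.875 while the WGA is 0.0, which is why average accuracy alone can hide failures on small subpopulations. Note that computing WGA this way itself requires domain annotations for the evaluation set.
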
Keywords
- Artificial intelligence
- Optimization