A replica analysis of under-bagging
by Takashi Takahashi
First submitted to arXiv on 15 Apr 2024
Categories
- Main: Machine Learning (stat.ML)
- Secondary: Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper investigates the effectiveness of under-bagging (UB) for training classifiers on imbalanced data, asking whether using bagging to reduce the variance caused by under-sampling pays off in generalized linear models. The authors heuristically derive sharp asymptotics for UB and compare it with two other popular methods: under-sampling (US) and simple weighting (SW). They find that UB’s performance improves as the majority class grows while the minority class is held fixed, whereas US is almost independent of the majority class size. Moreover, SW with optimally tuned weighting coefficients performs comparably to UB. A minimal code sketch of these three methods appears after this table. |
Low | GrooveSquid.com (original content) | The paper looks at ways to train classifiers on data where one group has far more examples than another. One method, called under-bagging (UB), combines two ideas: under-sampling the larger group and bagging (training many models on resampled data and averaging them to reduce noise). The authors find that UB gets better as you add more examples from the larger group. This is different from another method, US, which barely changes no matter how many examples the larger group has. They also compare UB to a third method, SW, which gives extra importance to the smaller group; with the right settings, it does almost as well as UB. |
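To make the comparison concrete, here is a minimal sketch of under-bagging, assuming binary labels in {0, 1} with class 1 as the minority. It uses scikit-learn’s `LogisticRegression` as a stand-in for the paper’s generalized linear models; the function names and the number of bags `n_bags` are illustrative choices, not the paper’s implementation.

```python
# Minimal under-bagging (UB) sketch. Assumes X is an (n, d) feature matrix and
# y is a binary label vector in {0, 1}, with class 1 as the minority class.
import numpy as np
from sklearn.linear_model import LogisticRegression

def under_bagging_fit(X, y, n_bags=10, seed=0):
    """Train n_bags logistic models, each on a balanced under-sample."""
    rng = np.random.default_rng(seed)
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    models = []
    for _ in range(n_bags):
        # Under-sample the majority class down to the minority class size,
        # so each bag sees a balanced training set.
        sub = rng.choice(majority, size=minority.size, replace=False)
        idx = np.concatenate([minority, sub])
        models.append(LogisticRegression().fit(X[idx], y[idx]))
    return models

def under_bagging_predict(models, X):
    # Aggregate by averaging predicted probabilities over the bags.
    probs = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
    return (probs >= 0.5).astype(int)
```

In this sketch, under-sampling (US) corresponds to `n_bags=1`, and simple weighting (SW) can be approximated on the full data with `LogisticRegression(class_weight='balanced')`, which reweights classes inversely to their frequencies.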
Keywords
» Artificial intelligence » Bagging