Summary of Composite Active Learning: Towards Multi-Domain Active Learning with Theoretical Guarantees, by Guang-Yuan Hao et al.
Composite Active Learning: Towards Multi-Domain Active Learning with Theoretical Guarantees
by Guang-Yuan Hao, Hengguan Huang, Haotian Wang, Jie Gao, Hao Wang
First submitted to arXiv on: 3 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper, each written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | A novel active learning (AL) approach, dubbed composite active learning (CAL), is proposed to tackle the challenging multi-domain AL setting. Existing AL methods focus on a single domain, whereas CAL explicitly considers both domain-level and instance-level information. CAL first assigns domain-level budgets according to domain-level importance, which it estimates by optimizing an upper error bound; it then selects samples to label from each domain with an instance-level query strategy (a code sketch of this two-level procedure follows the table). Theoretical analysis shows that CAL achieves a better error bound than current AL methods, and empirical results demonstrate its effectiveness on both synthetic and real-world multi-domain datasets. |
| Low | GrooveSquid.com (original content) | Active learning (AL) helps improve model performance by choosing which data points to label first. Usually, this works well when all data comes from the same place. But what if you have data from different places, such as pictures taken in different environments? This is called a multi-domain AL setting. Current methods don’t work well here because they ignore how similar or different each domain is and can’t handle changes in data distribution between domains. To solve this problem, the authors introduce composite active learning (CAL). CAL takes into account both how similar the domains are and which samples are most important to label. The method does better than current AL methods on both synthetic and real-world datasets. |
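To make the two-level selection described in the medium summary concrete, here is a minimal Python sketch. It is not the paper's implementation: the function name `composite_active_learning`, the proportional budget split, and the entropy-based instance query are illustrative assumptions, and the domain importance weights are taken as inputs here, whereas the paper derives them by optimizing an upper error bound.

```python
import numpy as np

def composite_active_learning(domain_scores, domain_probs, total_budget):
    """Two-level selection sketch (hypothetical helper, not the paper's code).

    domain_scores : dict mapping domain name -> importance weight
                    (the paper obtains these by optimizing an upper error
                    bound; here they are simply given)
    domain_probs  : dict mapping domain name -> (n_samples, n_classes)
                    array of model class probabilities on unlabeled data
    total_budget  : total number of samples that may be labeled
    """
    # Domain level: convert importance weights into per-domain budgets.
    weights = np.array([domain_scores[d] for d in domain_scores], dtype=float)
    weights = weights / weights.sum()
    budgets = np.floor(weights * total_budget).astype(int)

    selected = {}
    for budget, domain in zip(budgets, domain_scores):
        probs = domain_probs[domain]
        # Instance level: entropy-based uncertainty as one possible query
        # strategy (the paper's exact instance-level strategy may differ).
        entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
        selected[domain] = np.argsort(-entropy)[:budget]
    return selected

# Toy usage with two synthetic domains and a labeling budget of 20.
rng = np.random.default_rng(0)
probs_a = rng.dirichlet(np.ones(3), size=100)
probs_b = rng.dirichlet(np.ones(3), size=100)
picks = composite_active_learning(
    {"domain_a": 0.7, "domain_b": 0.3},
    {"domain_a": probs_a, "domain_b": probs_b},
    total_budget=20,
)
print({d: len(idx) for d, idx in picks.items()})
```

The key design point the sketch tries to mirror is that the labeling budget is decided per domain before any individual sample is scored, so domains deemed more important receive more labels regardless of how the instance-level scores compare across domains.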
Keywords
- Artificial intelligence
- Active learning