Summary of Proxy Methods for Domain Adaptation, by Katherine Tsai et al.
Proxy Methods for Domain Adaptation
by Katherine Tsai, Stephen R. Pfohl, Olawale Salaudeen, Nicole Chiou, Matt J. Kusner, Alexander D’Amour, Sanmi Koyejo, Arthur Gretton
First submitted to arXiv on: 12 Mar 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same paper but is written at a different level of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | The paper tackles domain adaptation under distribution shift, where an unobserved latent variable causes changes in both the covariates and the labels; this setting violates the usual covariate-shift and label-shift assumptions. Proximal causal learning, a technique for estimating causal effects using proxy variables, is employed to adapt to the shift without explicitly modeling or recovering the latent variable. Two settings are considered: Concept Bottleneck, where an additional concept variable mediates the relationship between covariates and labels, and Multi-domain, where training data from multiple source domains with different distribution shifts are available. A two-stage kernel estimation approach is developed for adapting to complex distribution shifts in both settings (a minimal sketch of the two-stage idea appears below the table). Experimental results show that this approach outperforms other methods, in particular those that attempt to recover the latent confounder. |
| Low | GrooveSquid.com (original content) | Domain adaptation is hard when an unobserved variable causes changes in both the data and the labels. This paper proposes a solution using proximal causal learning, which works with proxy variables instead of trying to recover or model the unknown variable. Two scenarios are explored: when you have extra information about what links the data to the labels (Concept Bottleneck), and when you have training data from multiple sources that exhibit different shifts (Multi-domain). On complex distribution shifts, the approach does better than other methods. |
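To make the two-stage kernel estimation mentioned in the medium summary more concrete, here is a minimal, hypothetical sketch of a two-stage kernel ridge regression with a proxy variable. It is not the authors' exact estimator: the synthetic data, the variable names (X for covariates, W for the proxy, U for the latent confounder, Y for labels), and the choice of scikit-learn's KernelRidge with an RBF kernel are all illustrative assumptions.

```python
# Hypothetical sketch of two-stage kernel ridge regression in the spirit of
# proximal causal learning. NOT the paper's exact estimator; data-generating
# process and hyperparameters are assumptions chosen for illustration.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

# Synthetic data: an unobserved confounder U drives covariates X, proxy W,
# and labels Y. U is never given to the estimator.
n = 500
U = rng.normal(size=(n, 1))                 # latent confounder (unobserved)
X = U + 0.5 * rng.normal(size=(n, 1))       # covariates shifted by U
W = U + 0.5 * rng.normal(size=(n, 1))       # proxy carrying information about U
Y = (np.sin(X) + U + 0.1 * rng.normal(size=(n, 1))).ravel()

# Stage 1: estimate the conditional mean of the proxy given the covariates,
# E[W | X], with kernel ridge regression.
stage1 = KernelRidge(kernel="rbf", alpha=1e-2, gamma=1.0)
stage1.fit(X, W.ravel())
W_hat = stage1.predict(X).reshape(-1, 1)

# Stage 2: regress labels on covariates plus the stage-1 proxy features,
# letting the proxy absorb the confounding instead of recovering U.
stage2 = KernelRidge(kernel="rbf", alpha=1e-2, gamma=1.0)
stage2.fit(np.hstack([X, W_hat]), Y)

# At test time, apply the same two-stage pipeline to new covariates.
X_test = rng.normal(size=(10, 1))
W_test_hat = stage1.predict(X_test).reshape(-1, 1)
Y_pred = stage2.predict(np.hstack([X_test, W_test_hat]))
print(Y_pred[:3])
```

The point of the sketch is the structure rather than the specifics: the proxy W stands in for the unobserved confounder, so the second stage never needs to model or recover the latent variable itself.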
Keywords
- Artificial intelligence
- Domain adaptation