Summary of Extracting PAC Decision Trees from Black Box Binary Classifiers: The Gender Bias Study Case on BERT-based Language Models, by Ana Ozaki et al.
Extracting PAC Decision Trees from Black Box Binary Classifiers: The Gender Bias Study Case on BERT-based Language Models
by Ana Ozaki, Roberto Confalonieri, Ricardo Guimarães, Anders Imenes
First submitted to arXiv on: 13 Dec 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Decision trees can serve as surrogate models for complex AI models, or as approximations of parts of such models, providing inherent explainability. However, determining how accurately the extracted decision tree reflects the original model is a key challenge. This paper investigates using the Probably Approximately Correct (PAC) framework to provide theoretical guarantees of fidelity for decision trees extracted from AI models. The authors adapt a decision tree extraction algorithm so that, under certain conditions, the extracted tree comes with PAC guarantees, focusing on binary classification. Experiments extracting decision trees with PAC guarantees from BERT-based language models reveal occupational gender bias in these models. |
Low | GrooveSquid.com (original content) | Decision trees can be used to make complex AI models easier to understand. But how do we know if the simplified model is accurate and trustworthy? This paper tackles this problem using a mathematical framework called Probably Approximately Correct (PAC). The researchers adapt a decision tree algorithm to work within this framework, focusing on simple yes-or-no answers (binary classification). They test their approach by simplifying AI models used for language tasks, and find that these models associate certain occupations with a particular gender. |
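To give a flavor of the PAC idea behind the paper, here is a minimal sketch (not the authors' actual algorithm) of extracting a tiny surrogate decision tree from a black-box binary classifier. The black-box function, the depth-1 "stump" learner, and the candidate thresholds are all illustrative assumptions; the sample-size formula is the standard PAC bound for a finite hypothesis class, which says roughly (1/ε)(ln|H| + ln(1/δ)) labeled queries suffice for a consistent hypothesis to have error at most ε with probability at least 1 − δ.

```python
import math
import random

def pac_sample_size(epsilon, delta, hypothesis_count):
    """Standard PAC bound for a finite hypothesis class (realizable case):
    m >= (1/epsilon) * (ln|H| + ln(1/delta)) labeled examples suffice so
    that, with probability >= 1 - delta, any hypothesis consistent with
    the sample has error <= epsilon w.r.t. the black-box labels."""
    return math.ceil((1.0 / epsilon)
                     * (math.log(hypothesis_count) + math.log(1.0 / delta)))

def black_box(x):
    # Stand-in for the opaque classifier (the paper queries BERT-based
    # models); here just a hidden threshold rule for illustration.
    return 1 if x[0] > 0.6 else 0

def fit_stump(samples):
    """Fit a depth-1 decision tree (stump) on one feature by trying a
    fixed grid of thresholds and keeping the best-matching one."""
    best = None
    for t in [i / 20 for i in range(21)]:  # |H| = 21 candidate stumps
        agree = sum((1 if x[0] > t else 0) == y for x, y in samples)
        if best is None or agree > best[1]:
            best = (t, agree)
    threshold, agreements = best
    return threshold, agreements / len(samples)  # fidelity on the sample

random.seed(0)
eps, delta = 0.05, 0.05
m = pac_sample_size(eps, delta, hypothesis_count=21)

# Query the black box on m random inputs, then fit the surrogate stump.
sample = [((random.random(),),) for _ in range(m)]
sample = [(x, black_box(x)) for (x,) in sample]
threshold, fidelity = fit_stump(sample)
print(f"samples needed: {m}, learned threshold: {threshold}, "
      f"fidelity: {fidelity:.3f}")
```

Because one candidate threshold coincides with the hidden rule, the stump matches the black box on every queried point; in general the PAC bound only promises that the surrogate's disagreement with the black box stays below ε with high probability.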
Keywords
» Artificial intelligence » Bert » Classification » Decision tree