Evaluating Pre-Training Bias on Severe Acute Respiratory Syndrome Dataset
by Diego Dimer Rodrigues
First submitted to arXiv on: 27 Aug 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Machine learning has many practical applications in health, but as datasets grow and models become more widespread, concerns about bias arise. Biased models can cause harm by basing decisions on sensitive attributes such as gender or ethnicity. To mitigate this, visualization techniques can surface insights from large datasets, helping data scientists understand the data before training a model. This work uses the severe acute respiratory syndrome dataset from OpenDataSUS to visualize three pre-training bias metrics and their distribution across Brazil's regions. A random forest model is trained on each region and applied to the others, allowing bias to be compared across regions with a focus on protected attributes and relating model performance to the metric values. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper uses machine learning to help make better decisions in health care. Right now, there's a problem: some models can be unfair because they base decisions on things like where you live or what your gender is. This can hurt people. To address this, the paper examines a big dataset and tries to understand it before training a model. The authors use a dataset about severe acute respiratory syndrome from OpenDataSUS. Then they train a model on each region of Brazil and see how well it does on the others. The goal is to make sure these models don't have biases against certain groups. |
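The workflow the medium summary describes — measuring pre-training bias per region, then training a random forest on one region and evaluating it on another — can be sketched as follows. This is a minimal illustration, not the paper's implementation: it uses class imbalance as one example of a pre-training bias metric, and synthetic data with made-up column names (`region`, `sex`, `age`, `outcome`) in place of the real OpenDataSUS dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the OpenDataSUS SARS dataset (illustrative only).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "region": rng.choice(["North", "South"], size=n),
    "sex": rng.choice(["F", "M"], size=n),   # hypothetical protected attribute
    "age": rng.integers(0, 90, size=n),
    "outcome": rng.integers(0, 2, size=n),   # hypothetical binary label
})

def class_imbalance(frame, attr, advantaged):
    """Class imbalance (n_a - n_d) / (n_a + n_d), ranging over [-1, 1].

    One common pre-training bias metric, computed from the data alone,
    before any model is trained.
    """
    n_a = (frame[attr] == advantaged).sum()
    n_d = (frame[attr] != advantaged).sum()
    return (n_a - n_d) / (n_a + n_d)

# Pre-training bias per region, before fitting any model.
for region, sub in df.groupby("region"):
    print(region, "class imbalance:", round(class_imbalance(sub, "sex", "M"), 3))

# Train a random forest on one region, apply it to another,
# so performance can be compared against the bias metric values.
features = ["age"]
train_df = df[df["region"] == "North"]
test_df = df[df["region"] == "South"]
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(train_df[features], train_df["outcome"])
acc = accuracy_score(test_df["outcome"], clf.predict(test_df[features]))
print("cross-region accuracy:", round(acc, 3))
```

In the paper's setup this loop would run over all Brazilian regions, pairing each region's metric values with the cross-region model performance.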
Keywords
» Artificial intelligence » Machine learning » Random forest