Summary of Confronting Discrimination in Classification: Smote Based on Marginalized Minorities in the Kernel Space for Imbalanced Data, by Lingyun Zhong
Confronting Discrimination in Classification: Smote Based on Marginalized Minorities in the Kernel Space for Imbalanced Data
by Lingyun Zhong
First submitted to arXiv on: 13 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract on the paper's arXiv page. |
| Medium | GrooveSquid.com (original content) | A novel oversampling approach to financial fraud detection is proposed, tackling the challenge of class imbalance. Mainstream classifiers often exhibit "implicit discrimination" against critical minority samples, misclassifying them frequently. The proposed method generates new minority samples adaptively in kernel space, weighting each candidate by its distance to the decision hyperplane and by the density of the surrounding samples (a rough code sketch of this idea appears below the table). Tested on a classic financial fraud dataset, the approach improves classification accuracy for the minority class. |
| Low | GrooveSquid.com (original content) | Financial fraud that goes undetected can cause huge economic losses. Most current detection methods are good at recognizing when someone is NOT committing fraud, but they often get it wrong when someone IS, because they don't account for how rare and important these "fraud" cases are. A new method is proposed that creates extra examples of these rare cases, paying attention to how far each suspicious sample sits from the line that separates fraud from non-fraud. This helps make sure the right samples are flagged as potential fraud. |
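Neither summary spells out the algorithm in detail, so below is a minimal Python sketch of the general idea only: oversample minority ("fraud") samples that sit close to an SVM decision hyperplane and in sparse regions, using SMOTE-style interpolation. Everything here is an assumption for illustration rather than the paper's actual method; in particular, the weighting formula is invented, and the interpolation happens in the original input space, whereas the paper works in kernel space.

```python
# Hypothetical sketch of boundary- and density-aware minority oversampling.
# NOT the paper's algorithm: the weighting rule is assumed, and interpolation
# is done in input space rather than kernel space.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import NearestNeighbors
from sklearn.datasets import make_classification

rng = np.random.default_rng(0)

# Imbalanced toy data: class 1 plays the role of the rare "fraud" class.
X, y = make_classification(
    n_samples=1000, n_features=8, weights=[0.95, 0.05], random_state=0
)

# Fit an RBF-kernel SVM; decision_function gives a signed score whose
# magnitude reflects distance to the separating hyperplane.
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

X_min = X[y == 1]
scores = np.abs(svm.decision_function(X_min))

# Sparsity of each minority sample: mean distance to its k nearest
# minority neighbors (larger mean distance => sparser neighborhood).
k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
dists, idxs = nn.kneighbors(X_min)          # column 0 is the sample itself
sparsity = dists[:, 1:].mean(axis=1)

# Assumed sampling weight: favor samples near the hyperplane (small score)
# that also sit in sparse regions (large mean neighbor distance).
w = sparsity / (scores + 1e-8)
p = w / w.sum()

# SMOTE-style interpolation: pick a weighted seed, then step a random
# fraction of the way toward one of its minority-class neighbors.
n_new = 200
seeds = rng.choice(len(X_min), size=n_new, p=p)
X_new = np.empty((n_new, X.shape[1]))
for i, s in enumerate(seeds):
    nb = idxs[s, rng.integers(1, k + 1)]    # a random minority neighbor
    lam = rng.random()                      # interpolation factor in [0, 1)
    X_new[i] = X_min[s] + lam * (X_min[nb] - X_min[s])

X_bal = np.vstack([X, X_new])
y_bal = np.concatenate([y, np.ones(n_new, dtype=int)])
print(X_bal.shape, y_bal.mean())            # augmented set, new class ratio
```

The `sparsity / distance` weighting is just one plausible way to emphasize marginalized boundary samples; the abstract does not state the paper's exact weighting or its kernel-space sampling procedure.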
Keywords
* Artificial intelligence
* Classification