Summary of An Experiment on Feature Selection Using Logistic Regression, by Raisa Islam et al.
An Experiment on Feature Selection using Logistic Regression
by Raisa Islam, Subhasish Mazumdar, Rakibul Islam
First submitted to arXiv on: 31 Jan 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper but are written at different levels of difficulty: the medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to read the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The paper investigates feature selection for supervised machine learning using L1 and L2 regularization with logistic regression (LR). Findings from both penalties are synthesized into a single feature set to improve explainability and performance. The experiments use the CIC-IDS2018 dataset, which contains two classes that are hard to separate. The study compares LR+L1 against LR+L2 while varying the feature-set size for each ranking and finds no significant difference in accuracy between the two methods once the feature set is selected. The synthesized feature set is also tested on Decision Tree and Random Forest models, which achieve comparable accuracy despite the small feature-set size. (A minimal code sketch of this workflow follows the table.) |
| Low | GrooveSquid.com (original content) | The paper looks at how to choose the features that matter most in machine learning. It uses two techniques, L1 and L2 regularization, with logistic regression to do this. The study picks a big dataset with some tricky classes to separate, compares the results of using L1 or L2 regularization separately, and then combines them to see if that helps. It finds that combining the methods doesn't make much difference in how well the model performs. The combined feature set is also tested on more complex models like Decision Trees and Random Forests, which do well despite using fewer features. |
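To make the described workflow concrete, here is a minimal scikit-learn sketch of the general technique: rank features by the coefficient magnitudes of L1- and L2-regularized logistic regression, combine the two rankings, and evaluate the resulting feature set on downstream models. This is not the authors' code; the synthetic stand-in data (in place of CIC-IDS2018), the union-of-top-k synthesis rule, and the set size `k` are illustrative assumptions, not details from the paper.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary-classification stand-in for the two hard-to-separate
# CIC-IDS2018 classes (assumption: the paper uses the real dataset).
X, y = make_classification(n_samples=5000, n_features=30, n_informative=8,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Standardize so coefficient magnitudes are comparable across features.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

def rank_features(penalty, solver):
    """Rank features by absolute coefficient size of a regularized LR."""
    lr = LogisticRegression(penalty=penalty, solver=solver, C=1.0,
                            max_iter=5000).fit(X_train, y_train)
    return np.argsort(-np.abs(lr.coef_[0]))  # indices, most important first

rank_l1 = rank_features("l1", "liblinear")  # LR + L1 (sparse coefficients)
rank_l2 = rank_features("l2", "lbfgs")      # LR + L2

k = 10  # feature-set size; the paper varies this for each ranking
# One plausible "synthesis" rule (an assumption): keep any feature that
# appears in the top-k of either ranking.
synth = sorted(set(rank_l1[:k]) | set(rank_l2[:k]))

# Evaluate the synthesized feature set on the downstream models.
for model in (LogisticRegression(max_iter=5000),
              DecisionTreeClassifier(random_state=0),
              RandomForestClassifier(random_state=0)):
    model.fit(X_train[:, synth], y_train)
    print(type(model).__name__, model.score(X_test[:, synth], y_test))
```

Swapping in the actual CIC-IDS2018 features and the paper's own combination rule would reproduce the comparison the summaries describe: similar accuracy from LR+L1 and LR+L2, and competitive Decision Tree and Random Forest accuracy from the much smaller synthesized feature set.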
Keywords
- Artificial intelligence
- Decision tree
- Feature selection
- Logistic regression
- Machine learning
- Random forest
- Regularization
- Supervised