Summary of Exploration and Evaluation of Bias in Cyberbullying Detection with Machine Learning, by Andrew Root et al.


Exploration and Evaluation of Bias in Cyberbullying Detection with Machine Learning

by Andrew Root, Liam Jakubowski, Mounika Vanamala

First submitted to arXiv on: 30 Nov 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High difficulty summary (the paper's original abstract)

Read the original abstract here.

Medium difficulty summary (written by GrooveSquid.com, original content)

This research paper explores the impact of dataset quality and labeling on machine learning models’ ability to generalize to unseen data. Specifically, it uses three popular cyberbullying datasets to investigate how different data collection methods and labeling definitions affect model performance. The study highlights the importance of dataset curation and cross-dataset testing for creating models with real-world applicability. It also demonstrates that existing models can experience a significant drop in their Macro F1 Score when evaluated on unseen datasets, emphasizing the need for robust evaluation strategies.

Low difficulty summary (written by GrooveSquid.com, original content)

This research shows how machine learning models can be affected by the quality of the data used to train them. The study looked at three types of cyberbullying data and found that different ways of collecting and labeling this data can affect how well a model works. The researchers also tested how well these models work on new, unseen data and found that they don’t perform as well as expected. This is important because it means we need to be careful when creating and testing machine learning models to make sure they’re ready for real-world use.

Keywords

» Artificial intelligence  » F1 score  » Machine learning