Summary of Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media, by Yao Ge, Sudeshna Das, Karen O’Connor, Mohammed Ali Al-Garadi, Graciela Gonzalez-Hernandez, and Abeed Sarker
Reddit-Impacts: A Named Entity Recognition Dataset for Analyzing Clinical and Social Effects of Substance Use Derived from Social Media
by Yao Ge, Sudeshna Das, Karen O’Connor, Mohammed Ali Al-Garadi, Graciela Gonzalez-Hernandez, Abeed Sarker
First submitted to arXiv on: 9 May 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper presents Reddit-Impacts, a Named Entity Recognition (NER) dataset curated from subreddits that discuss prescription and illicit opioids, as well as medications for opioid use disorder. The dataset focuses on the clinical and social impacts of substance use as reported by people with lived experience. To build it, the authors collected posts through the publicly available Reddit API and manually annotated the text spans that describe clinical and social impacts. The goal is to enable systems that automatically detect these impacts in text-based social media data, ultimately informing public health strategies. To establish baseline performance for automatic NER of clinical and social impacts, the authors also evaluated models such as BERT, RoBERTa, DANN, and GPT-3.5. The dataset is available through the 2024 SMM4H shared tasks. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper creates a new dataset called Reddit-Impacts that helps us understand how substance use affects people’s health and society. Substance use disorders are a big problem, and we need to learn more about them to make better decisions. The authors collected posts from Reddit and labeled parts of them as “clinical” (related to health) or “social” (related to relationships and community). They also tested different machine learning models to see whether they can recognize these impacts automatically. This dataset will help us understand how substance use affects people’s lives and make better public health decisions. |
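To make the idea of annotated text spans more concrete, here is a minimal sketch of how character-level span annotations are commonly converted into token-level BIO tags before training an NER model. It does not reproduce the authors' pipeline: the label names `ClinicalImpacts` and `SocialImpacts`, the helper functions, the whitespace tokenizer, and the example sentence are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (start_char, end_char_exclusive, label). Label names are hypothetical;
# the dataset's actual tag set may differ.
Span = Tuple[int, int, str]


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace and record each token's character offsets."""
    tokens = []
    pos = 0
    for tok in text.split():
        start = text.index(tok, pos)
        end = start + len(tok)
        tokens.append((tok, start, end))
        pos = end
    return tokens


def spans_to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Give each token a BIO tag based on which annotated span it overlaps."""
    tagged = []
    prev_idx: Optional[int] = None
    for tok, start, end in tokenize_with_offsets(text):
        # Index of the first span that overlaps this token, if any.
        idx = next(
            (i for i, (s, e, _) in enumerate(spans) if start < e and end > s),
            None,
        )
        if idx is None:
            tag = "O"
        else:
            # "B" opens a new entity, "I" continues the previous token's entity.
            prefix = "I" if idx == prev_idx else "B"
            tag = f"{prefix}-{spans[idx][2]}"
        tagged.append((tok, tag))
        prev_idx = idx
    return tagged


def find_span(text: str, phrase: str, label: str) -> Span:
    """Build a span for the first occurrence of `phrase` (illustration only)."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


# Invented example sentence, not taken from the dataset.
text = "I lost my job after the overdose"
spans = [
    find_span(text, "lost my job", "SocialImpacts"),
    find_span(text, "overdose", "ClinicalImpacts"),
]
bio = spans_to_bio(text, spans)
for token, tag in bio:
    print(f"{token}\t{tag}")
```

Token-level tags of this kind are the usual input format for fine-tuning encoder models such as BERT or RoBERTa on NER. A real pipeline would typically use the model's own subword tokenizer rather than whitespace splitting.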
Keywords
» Artificial intelligence » BERT » GPT » Machine learning » Named entity recognition » NER