Summary of Data Augmentation Method For Modeling Health Records with Applications to Clopidogrel Treatment Failure Detection, by Sunwoong Choi and Samuel Kim
Data augmentation method for modeling health records with applications to clopidogrel treatment failure detection
by Sunwoong Choi, Samuel Kim
First submitted to arXiv on: 28 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper introduces a data augmentation method that addresses data scarcity when modeling longitudinal patterns in Electronic Health Records (EHR) with natural language processing (NLP) algorithms. The technique rearranges the order of medical records within a visit, where the order of elements is not obvious. When used during pre-training, it raises ROC-AUC on clopidogrel treatment failure detection by up to 5.3% absolute. The augmentation also helps the fine-tuning procedure, especially when labeled training data is limited. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper tackles a problem with using computer algorithms to analyze patient records from hospitals. These records are important for understanding how patients change over time, but there often isn’t enough data to train the algorithms well. To fix this, the authors developed a way to generate extra training data by shuffling the order of medical records within each visit. This helps the algorithms learn better and makes them more accurate at predicting things like whether a medicine such as clopidogrel will fail to work for a patient. |
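
The within-visit shuffling described in the summaries above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a patient record is a list of visits, each visit is a list of code strings, and the order of visits must be preserved while the order inside a visit may be permuted. The function name `shuffle_within_visits` and the example codes are hypothetical.

```python
import random
from typing import List, Optional

Visit = List[str]  # assumed representation: one visit = a list of medical codes


def shuffle_within_visits(visits: List[Visit], seed: Optional[int] = None) -> List[Visit]:
    """Return a copy of the record with codes permuted inside each visit.

    The order of visits (the longitudinal structure) is left unchanged;
    only the order of codes within a single visit is randomized.
    """
    rng = random.Random(seed)
    augmented = []
    for visit in visits:
        codes = list(visit)  # copy so the original record is not modified
        rng.shuffle(codes)
        augmented.append(codes)
    return augmented


# Hypothetical patient record with two visits (codes are illustrative only).
patient = [
    ["I25.10", "clopidogrel", "aspirin"],
    ["Z95.5", "atorvastatin"],
]
augmented = shuffle_within_visits(patient, seed=0)
```

Calling the function several times with different seeds would yield several augmented copies of the same record, each containing the same codes per visit in a different order.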
Keywords
* Artificial intelligence
* AUC
* Fine-tuning
* Natural language processing
* NLP