Summary of Improve Fidelity and Utility of Synthetic Credit Card Transaction Time Series from Data-centric Perspective, by Din-Yin Hsieh et al.
Improve Fidelity and Utility of Synthetic Credit Card Transaction Time Series from Data-centric Perspective
by Din-Yin Hsieh, Chi-Hua Wang, Guang Cheng
First submitted to arXiv on: 1 Jan 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: None
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper tackles the challenge of generating high-quality synthetic tabular data in sequential contexts, specifically credit card transactions. The authors aim for both high fidelity to the real data and high utility for machine learning tasks. To that end, they introduce five pre-processing schemas that improve the training of the Conditional Probabilistic Auto-Regressive Model (CPAR). Once fidelity reaches a satisfactory level, they train fraud detection models tailored to time-series data in order to evaluate the synthetic data’s utility. The findings offer practical guidelines for practitioners in the finance sector who are moving from real to synthetic datasets for training. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper helps create fake but realistic data for machine learning tasks, like detecting credit card fraud. This kind of data is hard to make because it needs to look like real data while still being useful for training models. The authors develop new ways to prepare the data before generating it, which makes the fake data more realistic and more useful. They test these methods by teaching machines to detect fraudulent transactions using a dataset that looks like real credit card information. This research is important for people who work with finance data and want to train their models without using real data. |
Keywords
* Artificial intelligence
* Attention
* Machine learning
* Synthetic data
* Time series