Summary of A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining, by Audrey Der et al.


A Systematic Evaluation of Generated Time Series and Their Effects in Self-Supervised Pretraining

by Audrey Der, Chin-Chia Michael Yeh, Xin Dai, Huiyuan Chen, Yan Zheng, Yujie Fan, Zhongfang Zhuang, Vivian Lai, Junpeng Wang, Liang Wang, Wei Zhang, Eamonn Keogh

First submitted to arXiv on: 15 Aug 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (original content by GrooveSquid.com)
Self-supervised Pretrained Models (PTMs) have achieved impressive results in computer vision and natural language processing tasks. Researchers have applied this success to time series data, hoping for similar outcomes. However, our experiments reveal that most self-supervised time series PTMs are outperformed by simple supervised models. We attribute this phenomenon to data scarcity, hypothesizing that real-world datasets may be too limited to adequately train these models. To address this issue, we investigate six time series generation methods and examine how using generated data in place of real data affects classification performance. Our findings show that substituting real-data pretraining sets with larger volumes of generated samples leads to noticeable improvements. (A rough sketch of this pretrain-then-classify setup follows the summaries below.)

Low Difficulty Summary (original content by GrooveSquid.com)
Time series data is used for forecasting and analysis, but self-supervised Pretrained Models (PTMs) haven't done well on this kind of data: they were beaten by simpler models trained directly on real data. Scientists think this might be because there isn't enough real-world data to train the PTMs properly. To test this idea, the researchers tried different ways of creating artificial time series data and measured how pretraining on it affects how well the models classify time series. Using a lot of generated data instead of real data made the models better.
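
As a rough illustration only (not the authors' code), the sketch below mirrors the setup the summaries describe: pretrain an encoder with a self-supervised objective on a large, synthetically generated set of time series standing in for a scarce real dataset, then reuse its representation for classification. The sinusoid-mixture generator and the masked-reconstruction pretext task here are hypothetical stand-ins; they are not among the six generation methods or the specific PTMs evaluated in the paper.

import numpy as np
import torch
from torch import nn

def generate_series(n_samples, length=128, seed=0):
    """Hypothetical generator: random sums of sinusoids plus noise,
    standing in for a generated pretraining set."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, length)
    freqs = rng.uniform(1.0, 8.0, size=(n_samples, 3))
    phases = rng.uniform(0.0, 2 * np.pi, size=(n_samples, 3))
    amps = rng.uniform(0.5, 1.5, size=(n_samples, 3))
    x = (amps[:, :, None]
         * np.sin(2 * np.pi * freqs[:, :, None] * t + phases[:, :, None])).sum(1)
    return x + 0.1 * rng.standard_normal(x.shape)

class Encoder(nn.Module):
    """Tiny MLP encoder with a reconstruction head used only during pretraining."""
    def __init__(self, length=128, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(length, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden))
        self.head = nn.Linear(hidden, length)

    def forward(self, x):
        z = self.net(x)          # representation reused for downstream classification
        return self.head(z), z

def pretrain(model, series, epochs=5, mask_frac=0.25, lr=1e-3):
    """Self-supervised pretext task: reconstruct randomly masked points."""
    x = torch.tensor(series, dtype=torch.float32)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        mask = torch.rand_like(x) < mask_frac    # random point-wise mask
        recon, _ = model(x * (~mask).float())    # hide masked values from the encoder
        loss = ((recon - x)[mask] ** 2).mean()   # score only the masked positions
        opt.zero_grad(); loss.backward(); opt.step()
    return model

# Pretrain on a generated set instead of a small real one; the encoder's
# representation z would then be fine-tuned on real labeled data.
model = pretrain(Encoder(), generate_series(2048))

In the paper's experiments the generated pretraining set is made much larger than the real one; the sizes and hyperparameters above are arbitrary placeholders.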

Keywords

» Artificial intelligence  » Classification  » Natural language processing  » Pretraining  » Self supervised  » Supervised  » Time series