MrSQM: Fast Time Series Classification with Symbolic Representations

2 Sep 2021  ·  Thach Le Nguyen, Georgiana Ifrim

Symbolic representations of time series have proven effective for time series classification, with many recent approaches including SAX-VSM, BOSS, WEASEL, and MrSEQL. The key idea is to transform numerical time series into symbolic representations in the time or frequency domain, i.e., sequences of symbols, and then extract features from these sequences. While achieving high accuracy, existing symbolic classifiers are computationally expensive. In this paper we present MrSQM, a new time series classifier that uses multiple symbolic representations and efficient sequence mining to extract important time series features. We study four feature selection approaches on symbolic sequences, ranging from fully supervised to unsupervised and hybrid. We propose a new approach for optimal supervised symbolic feature selection in all-subsequence space by adapting a Chi-squared bound, developed for discriminative pattern mining, to time series. Our extensive experiments on 112 datasets of the UEA/UCR benchmark demonstrate that MrSQM can quickly extract useful features and learn accurate classifiers with the classic logistic regression algorithm. Interestingly, we find that a very simple and fast feature selection strategy can be highly effective compared with more sophisticated and expensive methods. MrSQM advances the state of the art for symbolic time series classifiers and is an effective method for achieving high accuracy with fast runtime.
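To make the "numerical series to sequence of symbols, then features" idea concrete, here is a minimal sketch of a SAX-style time-domain discretization followed by subsequence counting. It is an illustration under stated assumptions, not the MrSQM implementation: the function names `sax_word` and `subsequence_counts`, the segment count, the four-letter alphabet, and the maximum subsequence length are all hypothetical choices made for this example.

```python
from bisect import bisect_left
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sax_word(series, n_segments=4, alphabet="abcd"):
    """Map a numeric series to a short symbolic word (SAX-style sketch).

    Assumes len(series) >= n_segments so that every segment is non-empty.
    """
    mu, sigma = mean(series), pstdev(series)
    # z-normalise; a constant series is mapped to all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise aggregate approximation: mean of each near-equal segment
    n = len(z)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]
    # Breakpoints that split a standard normal into equiprobable bins
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[bisect_left(cuts, v)] for v in paa)


def subsequence_counts(word, max_len=2):
    """Count every contiguous subsequence of the word up to max_len symbols."""
    counts = Counter()
    for length in range(1, max_len + 1):
        for start in range(len(word) - length + 1):
            counts[word[start:start + length]] += 1
    return counts


word = sax_word([0, 0, 0, 0, 10, 10, 10, 10])
features = subsequence_counts(word)
```

In a classifier of the kind the abstract describes, counts like these would be computed for many series and a selected subset would be passed to a linear model such as logistic regression; how that subset is selected is the subject of the paper and is not reproduced here.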
