Accelerating pattern-based time series classification: a linear time and space string mining approach

Authors: Atif Raza, Stefan Kramer

Abstract

Subsequence-based time series classification algorithms provide interpretable and generally more accurate classification models compared to the nearest-neighbor approach, albeit at a considerably higher computational cost. A number of discretized time series-based algorithms have been proposed to reduce the computational complexity of these algorithms; however, the asymptotic time complexity of the proposed algorithms is also cubic or higher-order polynomial. We present a remarkably fast and resource-efficient time series classification approach which employs a linear time and space string mining algorithm for extracting frequent patterns from discretized time series data. Compared to other subsequence or pattern-based classification algorithms, the proposed approach only requires a few parameters, which can be chosen arbitrarily and do not require any fine-tuning for different datasets. The time series data are discretized using symbolic aggregate approximation, and frequent patterns are extracted using a string mining algorithm. An independence test is used to select the most discriminative frequent patterns, which are subsequently used to create a transformed version of the time series data. Finally, a classification model can be trained using any off-the-shelf algorithm. Extensive empirical evaluations demonstrate the competitive classification accuracy of our approach compared to other state-of-the-art approaches. The experiments also show that our approach is at least one to two orders of magnitude faster than the existing pattern-based methods due to the extremely fast frequent pattern extraction, which is the most computationally intensive process in pattern-based time series classification approaches.
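The pipeline described in the abstract (discretize, mine frequent patterns, select discriminative ones, transform) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, parameters, and default values are hypothetical, the pattern miner is a naive substring counter rather than a linear time and space string mining algorithm, and a chi-squared statistic is assumed as one possible independence test.

```python
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments=4, alphabet="abcd"):
    """Symbolic aggregate approximation of one series.

    Assumes len(series) >= n_segments so that no segment is empty.
    """
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Breakpoints that split a standard normal into equiprobable regions.
    k = len(alphabet)
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    word = []
    for s in range(n_segments):
        lo = s * len(z) // n_segments
        hi = (s + 1) * len(z) // n_segments
        avg = mean(z[lo:hi])  # piecewise aggregate approximation
        word.append(alphabet[sum(avg > c for c in cuts)])
    return "".join(word)


def frequent_patterns(words, min_support=2, max_len=4):
    """Naive substring counting; a stand-in for a real string miner.

    Support is the number of words that contain the pattern at least once.
    """
    support = Counter()
    for w in words:
        subs = {w[i:i + n]
                for n in range(1, max_len + 1)
                for i in range(len(w) - n + 1)}
        support.update(subs)
    return sorted(p for p, c in support.items() if c >= min_support)


def chi2_score(pattern, words, labels, target):
    """Chi-squared statistic of the 2x2 table (pattern present, class is target)."""
    table = [[0, 0], [0, 0]]
    for w, y in zip(words, labels):
        table[int(pattern in w)][int(y == target)] += 1
    n = len(words)
    rows = [sum(r) for r in table]
    cols = [table[0][j] + table[1][j] for j in range(2)]
    if 0 in rows or 0 in cols:
        return 0.0  # pattern or class is constant: no evidence of dependence
    return sum((table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i in range(2) for j in range(2))


def transform(words, patterns):
    """Binary feature matrix: one column per selected pattern."""
    return [[int(p in w) for p in patterns] for w in words]
```

In this sketch, each series is converted with `sax`, candidate patterns come from `frequent_patterns`, the highest-scoring patterns under `chi2_score` are kept, and the output of `transform` can be passed to any off-the-shelf classifier.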

Keywords: Time series, Classification, String mining, Linear time and space

Paper URL: https://doi.org/10.1007/s10115-019-01378-7