Out-of-core detection of periodicity from sequence databases

Authors: Faraz Rasheed, Muhaimenul Adnan, Reda Alhajj

Abstract

In this paper, we address the scalability problem of periodicity detection for time series and sequence databases. We present a time- and space-efficient periodicity detection method that makes effective use of external memory (disk) when the series cannot be processed within the available main memory. Our approach uses a suffix tree to facilitate periodicity detection. We consider two cases, namely in-core and out-of-core. First, we optimize the storage requirements of the suffix tree so that larger suffix trees fit in-core. This lets us mine larger sequences when everything can be kept in-core, which is the setting that current periodicity detection approaches are able to handle. Second, when the data structures go out-of-core, we extend the suffix tree construction step to use external memory, and we do so while maintaining linear time complexity. As a result, when we go out-of-core, we can mine databases whose data structures require considerably more space than the available main memory. For the out-of-core periodicity detection step, the proposed method allows the required data structures to be kept partially on external memory when a memory overflow occurs. We also propose various pruning strategies that allow the approach to process large sequences within a reasonable amount of time. Additionally, we introduce the notion of “emulated tree traversal” for fast suffix tree traversal. Owing to these considerations, we are able to mine much larger sequences than existing periodicity detection algorithms. To demonstrate the applicability, power, and effectiveness of the proposed framework, we present periodicity detection results on up to 500 MB of time sequence data, which, to the best of our knowledge, is the largest sequence reported to have been mined for periodicity.
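As a rough illustration of what detecting periodicity from the occurrence positions of a pattern can look like, the following in-core sketch may help. It is not the authors' algorithm: it uses a plain dictionary of substring positions as a stand-in for the suffix tree, and it has no external-memory handling, pruning, or emulated traversal. The function names, the `min_conf` threshold, and the confidence definition (observed occurrences divided by expected occurrences) are assumptions made for this example.

```python
from collections import defaultdict


def occurrence_positions(series, length):
    """Map each substring of the given length to its sorted start positions.

    Hypothetical stand-in for the occurrence lists a suffix tree would provide.
    """
    positions = defaultdict(list)
    for i in range(len(series) - length + 1):
        positions[series[i:i + length]].append(i)
    return positions


def detect_periods(series, pattern_len=1, min_conf=0.8):
    """Return (pattern, period, start, confidence) tuples.

    Candidate periods are the gaps between consecutive occurrences of a
    pattern; confidence is the fraction of expected slots
    start, start + period, ... that actually hold the pattern.
    """
    n = len(series)
    results = []
    for pattern, occ in occurrence_positions(series, pattern_len).items():
        occ_set = set(occ)
        seen = set()
        for prev, nxt in zip(occ, occ[1:]):
            period, start = nxt - prev, prev
            if (period, start) in seen:
                continue
            seen.add((period, start))
            # Slots where the pattern should appear if it were perfectly periodic.
            expected = range(start, n - pattern_len + 1, period)
            hits = sum(1 for pos in expected if pos in occ_set)
            conf = hits / len(expected)
            if conf >= min_conf:
                results.append((pattern, period, start, conf))
    return results


if __name__ == "__main__":
    for row in detect_periods("abcabcabd"):
        print(row)
```

Because this sketch materializes every substring position in memory, it only suits short inputs; sequences of the size the abstract targets are where the suffix tree and the out-of-core machinery it describes become necessary.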

Keywords: Suffix tree, Periodicity detection, Sequence mining, External memory

Paper URL: https://doi.org/10.1007/s10115-012-0546-1