DDR: an index method for large time-series datasets

Abstract

Tree index structures are a traditional method for similarity search in large datasets. They rest on the assumption that most subtrees are pruned during the search, which reduces the number of page accesses. Time-series datasets, however, generally have very high dimensionality, and because of the so-called curse of dimensionality, pruning becomes less effective as dimensionality grows. Consequently, tree index structures are not well suited to time-series datasets. In this paper, we propose a two-phase (filtering and refinement) method for searching time-series datasets. In the filtering step, quantized time series are used to construct a compact file, which is scanned to filter out irrelevant series. The small set of remaining candidates is passed to the second step for refinement. In this step, we introduce an effective index compression method named grid-based datawise dimensionality reduction (DDR), which attempts to preserve the characteristics of the time series. An experimental comparison with existing techniques demonstrates the utility of our approach.
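The two-phase idea can be illustrated with a minimal sketch. It is not the paper's implementation and does not attempt to reproduce DDR. It assumes a uniform scalar quantizer per dimension, Euclidean distance, a range query with radius `eps`, and data values that lie inside a known interval `[lo, hi]`. The function names `quantize`, `lower_bound`, and `range_search` are hypothetical, chosen for illustration only.

```python
import math

def quantize(series, lo, hi, bits):
    """Map each value to a cell index in [0, 2**bits - 1].

    Assumption: every value lies in [lo, hi]; the clamp only guards hi itself.
    """
    cells = 1 << bits
    width = (hi - lo) / cells
    return [min(cells - 1, max(0, int((v - lo) / width))) for v in series]

def lower_bound(query, codes, lo, hi, bits):
    """Smallest Euclidean distance from the query to any point in the cell."""
    cells = 1 << bits
    width = (hi - lo) / cells
    total = 0.0
    for q, c in zip(query, codes):
        cell_lo = lo + c * width
        cell_hi = cell_lo + width
        if q < cell_lo:
            total += (cell_lo - q) ** 2
        elif q > cell_hi:
            total += (q - cell_hi) ** 2
        # If q falls inside the cell, this dimension contributes nothing.
    return math.sqrt(total)

def range_search(query, data, eps, lo, hi, bits):
    """Return (candidates, results) as lists of indices into data."""
    # The "compact file": one short code list per series.
    compact = [quantize(s, lo, hi, bits) for s in data]
    # Phase 1 (filtering): scan the compact file and drop every series
    # whose lower bound already exceeds eps.
    candidates = [i for i, codes in enumerate(compact)
                  if lower_bound(query, codes, lo, hi, bits) <= eps]
    # Phase 2 (refinement): exact distances on the surviving candidates only.
    results = [i for i in candidates if math.dist(query, data[i]) <= eps]
    return candidates, results

data = [
    [1.5, 2.5, 3.5],
    [6.5, 7.5, 0.5],
    [1.2, 2.9, 3.1],
    [2.5, 3.5, 4.5],
    [1.9, 2.1, 3.9],
]
candidates, results = range_search([1.4, 2.6, 3.4], data, 0.6, 0.0, 8.0, 3)
print(candidates, results)
```

Because the lower bound never exceeds the true distance, the filtering scan cannot discard a true match; it can only let through false candidates (such as the last series here), which the refinement step then removes.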

Keywords: Time series, Indexing, Dimensionality reduction

Article history: Received 14 May 2003, Revised 5 February 2004, Accepted 5 May 2004, Available online 17 June 2004.

Official URL: https://doi.org/10.1016/j.is.2004.05.001