HierarchyScan: A Hierarchical Algorithm for Similarity Search in Databases Consisting of Long Sequences
作者:Chung-Sheng Li, Philip S. Yu, Vittorio Castelli
摘要
In this paper, a hierarchical algorithm, HierarchyScan, is proposed to efficiently locate one-dimensional subsequences within a collection of sequences with arbitrary length. The proposed algorithm performs correlation between the stored sequences and the template pattern in the transformed domain to identify subsequences in a scale- and phase-independent fashion. This is in contrast to those approaches based on the computation of Euclidean distance in the transformed domain. In the proposed hierarchical algorithm, the transformed domain representation of each original sequence is divided into multiple groups of coefficients. The matching is performed hierarchically from the group with the greatest filtering capability to the group with the lowest filtering capability. Only those subsequences whose maximum correlation value is higher than a predefined threshold will be selected for additional screening. This approach is compared to the sequential scanning and an order-of-magnitude speedup is observed.
论文关键词:Similarity search, template matching, time series, correlation, correlation coefficient, content-based retrieval, hierarchical search
论文评审过程:
论文官网地址:https://doi.org/10.1007/BF03325099