Storage-optimizing clustering algorithms for high-dimensional tick data

Authors:

Highlights:

• We investigate tick data for applications that keep track of values changing rapidly over time, like stock market prices.

• We decompose tick data by clustering its attributes using Storage-Optimizing Hierarchical Agglomerative Clustering.

• We propose a method for speeding up the algorithm based on a new lower bounding technique.

• Experimental results show that the proposed approach compares favorably to several baselines in terms of compression.

• Additionally, the proposed approach can lead to significant speedup in terms of running time.
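To make the idea of clustering attributes for storage more concrete, here is a minimal sketch of a greedy, storage-driven agglomerative clustering. It is not the paper's algorithm: the cost model (one stored row per tick at which any attribute of a cluster changes, with one timestamp cell plus one cell per attribute) is an assumption chosen for illustration, and the names `change_ticks`, `cluster_cost`, and `storage_hac` are hypothetical. The lower bounding speedup mentioned in the highlights is not reproduced here.

```python
from itertools import combinations


def change_ticks(column):
    """Return the tick indices at which a column's value differs from the previous tick."""
    return {i for i in range(len(column)) if i == 0 or column[i] != column[i - 1]}


def cluster_cost(cluster, changes):
    """Assumed cost model: cells stored for one cluster of attributes.

    A row is stored at every tick where any member attribute changed;
    each row holds one timestamp cell plus one cell per member attribute.
    """
    ticks = set().union(*(changes[name] for name in cluster))
    return len(ticks) * (len(cluster) + 1)


def storage_hac(columns):
    """Greedily merge attribute clusters while a merge reduces the assumed storage cost."""
    changes = {name: change_ticks(col) for name, col in columns.items()}
    clusters = [frozenset([name]) for name in columns]
    while len(clusters) > 1:
        best_gain, best_pair = 0, None
        for a, b in combinations(clusters, 2):
            gain = (cluster_cost(a, changes) + cluster_cost(b, changes)
                    - cluster_cost(a | b, changes))
            if gain > best_gain:
                best_gain, best_pair = gain, (a, b)
        if best_pair is None:
            break  # no remaining merge saves storage
        a, b = best_pair
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


if __name__ == "__main__":
    # Toy tick data (made up): bid and ask move together, volume never changes.
    ticks = {
        "bid": [1, 2, 2, 3, 3, 4],
        "ask": [2, 3, 3, 4, 4, 5],
        "volume": [10, 10, 10, 10, 10, 10],
    }
    print(storage_hac(ticks))
```

Under this assumed cost model, attributes that change at the same ticks end up in one cluster because they can share timestamp cells, while a rarely changing attribute stays separate so that it is not rewritten on every tick.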

Keywords: Tick data, Clustering, Storage

Review process: Available online 10 January 2014.

Official paper URL: https://doi.org/10.1016/j.eswa.2013.12.046