Fast and accurate mining of correlated heavy hitters

Authors: Italo Epicoco, Massimo Cafaro, Marco Pulimeno

Abstract

The problem of mining correlated heavy hitters (CHH) from a two-dimensional data stream was introduced recently, and Lahiri et al. proposed a deterministic algorithm based on the Misra–Gries algorithm to solve it. In this paper we present a new counter-based algorithm for tracking CHHs, formally prove its error bounds and correctness, and show through extensive experiments that it outperforms the Misra–Gries based algorithm in both accuracy and speed while requiring asymptotically much less space.
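
For readers unfamiliar with the baseline building block named in the abstract, the following is a minimal sketch of the classic one-dimensional Misra–Gries frequent-items summary. It is not the algorithm proposed in this paper, nor the two-dimensional CHH algorithm of Lahiri et al.; the function name `misra_gries` and the parameter `k` are illustrative assumptions. With at most k - 1 counters, every item occurring more than n/k times in a stream of length n is retained, and each reported count underestimates the true frequency by at most n/k.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """Classic Misra-Gries summary using at most k - 1 counters.

    Illustrative sketch only (hypothetical helper, not from the paper).
    Returned counts are lower bounds on the true frequencies.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            # Item is already monitored: increment its counter.
            counters[item] += 1
        elif len(counters) < k - 1:
            # A free counter is available: start monitoring the item.
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Usage: a stream of 10 items summarized with k = 3 (at most 2 counters).
example_stream = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
summary = misra_gries(example_stream, k=3)
```

Here "a" occurs 5 times, which exceeds n/k = 10/3, so it is guaranteed to appear in `summary`. A second pass over the data (or a tighter bound) would be needed to confirm which retained candidates are true heavy hitters.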

Keywords: Data stream mining, Correlation, Heavy hitters

Paper URL: https://doi.org/10.1007/s10618-017-0526-x