A high-performance distributed algorithm for mining association rules

Authors: Assaf Schuster, Ran Wolff, Dan Trock

Abstract

We present a new distributed association rule mining (D-ARM) algorithm that demonstrates superlinear speed-up with the number of computing nodes. The algorithm is the first D-ARM algorithm to perform a single scan over the database. As such, its performance is unmatched by any previous algorithm. Scale-up experiments over standard synthetic benchmarks demonstrate stable run time regardless of the number of computers. Theoretical analysis reveals a tighter bound on error probability than the one shown in the corresponding sequential algorithm. As a result of this tighter bound and by utilizing the combined memory of several computers, the algorithm generates far fewer candidates than comparable sequential algorithms—the same order of magnitude as the optimum.

Keywords: Association rule, Data mining, Distributed data mining, High-performance computing

Official paper page: https://doi.org/10.1007/s10115-004-0176-3