A general-purpose distributed pattern mining system

作者:Asma Belhadi, Youcef Djenouri, Jerry Chun-Wei Lin, Alberto Cano

摘要

This paper explores five pattern mining problems and proposes a new distributed framework called DT-DPM: Decomposition Transaction for Distributed Pattern Mining. DT-DPM addresses the limitations of the existing pattern mining problems by reducing the enumeration search space. Thus, it derives the relevant patterns by studying the different correlation among the transactions. It first decomposes the set of transactions into several clusters of different sizes, and then explores heterogeneous architectures, including MapReduce, single CPU, and multi CPU, based on the densities of each subset of transactions. To evaluate the DT-DPM framework, extensive experiments were carried out by solving five pattern mining problems (FIM: Frequent Itemset Mining, WIM: Weighted Itemset Mining, UIM: Uncertain Itemset Mining, HUIM: High Utility Itemset Mining, and SPM: Sequential Pattern Mining). Experimental results reveal that by using DT-DPM, the scalability of the pattern mining algorithms was improved on large databases. Results also reveal that DT-DPM outperforms the baseline parallel pattern mining algorithms on big databases.

论文关键词:Pattern mining, Decomposition, Distributed computing, Heterogeneous architecture

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-020-01664-w