FPO tree and DP3 algorithm for distributed parallel Frequent Itemsets Mining

Authors:

Highlights:

• DP3, a high-performance distributed parallel algorithm for Frequent Itemsets Mining.

• FPO tree with optimal compactness for lightweight transfers and efficient aggregation.

• Shared-memory parallelism to fully exploit the multi-core CPU resources of distributed nodes.

• Achieves memory scalability and load balance for both shared-memory and distributed parallelism.

• DP3 far outperforms well-known and recent high-performance algorithms.

Keywords: Frequent Itemsets Mining, Parallel, Distributed, Data Mining, Big Data, Prefix tree

Article history: Received 1 February 2019, Revised 12 June 2019, Accepted 14 August 2019, Available online 31 August 2019, Version of Record 8 September 2019.

Official paper URL: https://doi.org/10.1016/j.eswa.2019.112874