DBCURE-MR: An efficient density-based clustering algorithm for large data using MapReduce

Authors:

Highlights:

• A density-based clustering algorithm DBCURE can find clusters with varying densities.

• DBCURE is a generalization of DBSCAN using ellipsoidal neighborhoods.

• We propose a parallel version of DBCURE, called DBCURE-MR, using MapReduce.

• DBCURE-MR finds clusters correctly based on the definition of density-based clusters.

• Experimental results show the efficiency and scalability of the proposed algorithms.
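The second highlight says that DBCURE generalizes DBSCAN by using ellipsoidal neighborhoods. The sketch below illustrates that idea only. It is not the paper's definition: the function name `in_ellipsoidal_neighborhood`, the parameters `cov` and `tau`, and the Mahalanobis-style quadratic form are all assumptions made for illustration, restricted to 2-D points. With an identity covariance the test reduces to the familiar circular neighborhood of DBSCAN.

```python
def in_ellipsoidal_neighborhood(p, q, cov, tau):
    """Return True if q lies inside the ellipse centred at p.

    Assumed (hypothetical) membership test:
        (q - p)^T * inverse(cov) * (q - p) <= tau**2
    where cov is a 2x2 positive-definite matrix given as nested tuples.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("cov must be positive definite")
    dx, dy = q[0] - p[0], q[1] - p[1]
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]],
    # so the quadratic form expands to the expression below.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return quad <= tau * tau


identity = ((1.0, 0.0), (0.0, 1.0))    # circular neighborhood, as in DBSCAN
stretched = ((4.0, 0.0), (0.0, 0.25))  # elongated along the x-axis

origin = (0.0, 0.0)
far_along_x = (1.5, 0.0)
near_along_y = (0.0, 0.6)

circle_hits = [in_ellipsoidal_neighborhood(origin, q, identity, 1.0)
               for q in (far_along_x, near_along_y)]
ellipse_hits = [in_ellipsoidal_neighborhood(origin, q, stretched, 1.0)
                for q in (far_along_x, near_along_y)]
```

In this sketch the same two points fall in or out of the neighborhood depending on the shape matrix, which is how an elongated neighborhood can follow a stretched cluster that a fixed-radius circle would either cut off or over-cover.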

Keywords: Clustering algorithm, Density-based clustering, Parallel algorithm, MapReduce

Review history: Received 23 April 2013, Revised 18 November 2013, Accepted 22 November 2013, Available online 1 December 2013.

Paper URL: https://doi.org/10.1016/j.is.2013.11.002