Scalable local density-based distributed clustering

Authors:

Highlights:

Abstract

With the widespread application of networks, large amounts of high-dimensional data have become distributed. Distributed clustering has become an increasingly important task due to a variety of real-life constraints, including bandwidth and security. Many distributed clustering algorithms have been proposed, but most of them have a high transmission cost and poor clustering quality. In this paper, we propose a scalable local density-based distributed clustering algorithm that fits high-dimensional data sets easily by using measures such as density attractor distance and noise factor. To keep the transmission cost low, we select noise points with suitably low noise factors to send to the server. Furthermore, test data sets, the CMC data set, and KDD-CUP-99 are used for experimental evaluation to validate performance in practice. The experimental results and theoretical analysis show that the efficiency and clustering quality of the proposed algorithm are superior to those of other distributed clustering algorithms.

Keywords: Distributed clustering, High-dimensional data, DBDC, SDBDC, LDBDC

Publication history: Available online 3 February 2011.

Official paper URL: https://doi.org/10.1016/j.eswa.2011.01.144