An improved density peaks clustering algorithm with fast finding cluster centers

Authors:

Highlights:

Abstract

Speed and efficiency are common requirements for all clustering algorithms. The density peaks clustering (DPC) algorithm handles non-spherical clusters well. However, large-scale data sets are difficult to store and DPC has high computational complexity, so mining such data effectively remains a challenge. To address this issue, we propose an improved density peaks clustering algorithm that finds cluster centers quickly. It improves the efficiency of DPC by screening for points with higher local density, using two novel prescreening strategies. The first strategy, based on grid division (GDPC), screens points according to the density of their corresponding grid cells. The second strategy, based on circle division (CDPC), screens points according to the uneven distribution of the data within the corresponding circles. Theoretical analysis and experimental results show that both prescreening strategies reduce computational complexity, and that the proposed algorithm is not only more satisfactory than DPC but also superior to the well-known Nyström-SC algorithm on large-scale data sets. Moreover, because the two prescreening strategies rest on different principles, the first is faster and the second is more accurate on large-scale data sets.
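
The grid-based prescreening idea can be illustrated with a small sketch. The abstract does not give the actual GDPC screening rule, so everything specific here is an assumption made for illustration: the function names `grid_prescreen` and `dpc_centers`, the parameters `cell_size`, `d_c` and `k`, and the rule of keeping points whose cell holds at least the mean number of points per non-empty cell. Candidates are then ranked by the standard DPC decision value, local density times distance to the nearest denser point. This is not the authors' implementation.

```python
import math
from collections import defaultdict


def grid_prescreen(points, cell_size):
    """Return indices of 2-D points lying in dense grid cells.

    Assumed rule (not from the paper): a cell is dense when it holds at
    least the mean number of points per non-empty cell.
    """
    cells = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(idx)
    mean_count = len(points) / len(cells)
    return sorted(
        i
        for members in cells.values()
        if len(members) >= mean_count
        for i in members
    )


def dpc_centers(points, candidates, d_c, k):
    """Pick k centers among the candidates by gamma = rho * delta."""
    # rho: number of other points closer than the cutoff distance d_c.
    rho = {
        i: sum(
            1
            for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) < d_c
        )
        for i in candidates
    }
    # delta: distance to the nearest candidate with strictly higher rho;
    # a candidate with no denser neighbour gets its largest distance.
    delta = {}
    for i in candidates:
        dists = [
            (math.dist(points[i], points[j]), rho[j] > rho[i])
            for j in candidates
        ]
        higher = [d for d, is_denser in dists if is_denser]
        delta[i] = min(higher) if higher else max(d for d, _ in dists)
    ranked = sorted(candidates, key=lambda i: rho[i] * delta[i], reverse=True)
    return ranked[:k]
```

In this sketch the density and distance computations run only over the prescreened candidates rather than over every point, which is where a prescreening step would save work on large data sets.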

Keywords: Density peaks clustering algorithm, Prescreening strategy, Large-scale data set, Decision graph, Computational complexity

Article history: Received 28 November 2017, Revised 22 May 2018, Accepted 24 May 2018, Available online 25 May 2018, Version of Record 6 July 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.05.034