Two-stage clustering algorithm based on evolution and propagation patterns

Authors: Peng Li, Haibin Xie

Abstract

Popular clustering algorithms require the number of clusters and other hyperparameters to be set from prior knowledge. To address this, we use the average nearest neighbour distance, a statistic that characterises how samples aggregate in the data space, and propose a two-stage clustering algorithm based on evolution and propagation patterns (EPC). In the evolution stage, the EPC algorithm draws a small random sample from the data space and evolves it incrementally to obtain the initial clustering results and the number of clusters. In the propagation stage, the EPC algorithm propagates the cluster labels of the initial clustering results to the unlabelled samples according to the nearest neighbour principle. Furthermore, the EPC algorithm uses a correction mechanism. It runs multiple Monte Carlo simulations in the evolution stage to improve the stability of the clustering results obtained by random sampling. Experiments on datasets and applications to image segmentation datasets show that the EPC algorithm outperforms current popular clustering algorithms. Finally, we analyse the EPC algorithm systematically through ablation experiments, which show that it has good robustness.
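The two ingredients named in the abstract, the average nearest neighbour distance and label propagation by the nearest neighbour principle, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `avg_nn_distance` and `propagate_labels`, the use of Euclidean distance, and the toy data are all assumptions made for illustration, and the evolution stage and correction mechanism are not covered because the abstract does not specify them.

```python
import math


def avg_nn_distance(points):
    """Mean Euclidean distance from each point to its nearest other point.

    Assumes at least two points. Hypothetical helper, not from the paper.
    """
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def propagate_labels(labelled, labels, unlabelled):
    """Give each unlabelled point the label of its nearest labelled point.

    Hypothetical helper showing the nearest neighbour principle only.
    """
    result = []
    for u in unlabelled:
        nearest = min(range(len(labelled)), key=lambda k: math.dist(u, labelled[k]))
        result.append(labels[nearest])
    return result


# Toy data: two tight pairs of points far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
stat = avg_nn_distance(points)

# Pretend an earlier stage labelled one point per cluster.
seeds = [(0.0, 0.0), (10.0, 10.0)]
seed_labels = [0, 1]
assigned = propagate_labels(seeds, seed_labels, [(1.0, 1.0), (9.0, 9.0)])
```

In this toy case every point sits at distance 1 from its nearest neighbour, so the statistic is 1.0, and the two unlabelled points receive labels 0 and 1 respectively. Both functions scan all pairs, which is fine for a sketch but would need a spatial index for large datasets.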

Keywords: Clustering, Parameter-free, Evolution and propagation, Incremental way, Data mining

Review process:

Paper URL: https://doi.org/10.1007/s10489-021-03016-8