An improved path-based clustering algorithm

Authors:

Highlights:

Abstract

Path-based clustering algorithms usually generate clusters by optimizing a criterion function. Most state-of-the-art optimization methods give a solution that is only close to the global optimum. By analyzing the minimax distance, we find that cluster centers have the minimum density in their own clusters. Inspired by this, we propose an improved path-based clustering algorithm (IPC) that mines the cluster centers of the dataset. Because it is difficult to mine these cluster centers directly, IPC identifies them by a process of elimination. The algorithm reaches the global optimum in O(n^2) time. Experimental results on synthetic datasets show that IPC not only recognizes all kinds of clusters regardless of their shapes, sizes, and densities, but is also robust against noise and outliers in the data. More importantly, IPC needs only one parameter (i.e., the number of clusters). Experimental results on real datasets show that IPC outperforms the clustering algorithms it is compared with.
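To make the minimax distance mentioned in the abstract concrete, the following is a minimal sketch of how all-pairs minimax path distances can be computed from a minimum spanning tree (MST) in O(n^2) time. It illustrates only the distance measure, not the IPC algorithm itself, whose center-mining and elimination steps are not described here. The function name `minimax_distances`, the use of Euclidean distance, and the input format (a list of coordinate tuples) are assumptions made for illustration.

```python
import math


def minimax_distances(points):
    """All-pairs minimax path distances, built while growing an MST (Prim).

    The minimax distance between two points is the smallest possible value
    of the largest edge over all paths joining them; it equals the largest
    edge on their unique path in the MST.
    Assumption: Euclidean distance on a complete graph over `points`.
    """
    n = len(points)
    mm = [[0.0] * n for _ in range(n)]
    if n == 0:
        return mm

    def dist(a, b):
        return math.dist(points[a], points[b])

    in_tree = [0]                          # vertices already in the MST
    best = [dist(0, j) for j in range(n)]  # cheapest edge from the tree to j
    parent = [0] * n                       # tree endpoint of that edge
    remaining = set(range(1, n))

    while remaining:
        # Prim step: attach the closest outside vertex v through edge (u, v).
        v = min(remaining, key=lambda j: best[j])
        u, w = parent[v], best[v]
        # Any tree path from t to v passes through u and then edge (u, v).
        for t in in_tree:
            mm[t][v] = mm[v][t] = max(mm[t][u], w)
        in_tree.append(v)
        remaining.remove(v)
        # Update the cheapest connecting edge for the vertices still outside.
        for j in remaining:
            d = dist(v, j)
            if d < best[j]:
                best[j], parent[j] = d, v
    return mm
```

For example, with the four collinear points `(0, 0)`, `(1, 0)`, `(2, 0)`, `(10, 0)`, the minimax distance between the first and third points is 1.0 (the path through the middle point uses two edges of length 1), whereas their Euclidean distance is 2.0. This is the property that lets path-based methods follow elongated or irregularly shaped clusters.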

Keywords: Clustering, Global optimization, Minimax distance, MST, Path-based

Review history: Received 14 October 2017, Revised 29 May 2018, Accepted 11 August 2018, Available online 4 September 2018, Version of Record 21 November 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.08.012