A fast k-means clustering algorithm using cluster center displacement

Abstract

In this paper, we present a fast k-means clustering algorithm (FKMCUCD) that uses the displacements of cluster centers to reject unlikely candidate centers for each data point. The computing time of the proposed algorithm increases linearly with the data dimension d, whereas the computational complexity of the major available kd-tree based algorithms increases exponentially with d. Theoretical analysis shows that our method reduces the computational complexity of full search by a factor SF, and that SF is independent of the vector dimension. Experimental results show that, compared with full search, the proposed method reduces computational complexity by a factor of 1.37–4.39 on the data set obtained from six real images. Compared with the filtering algorithm, which is among the best available k-means clustering algorithms, our algorithm effectively reduces the computing time. The proposed algorithm generates the same clusters as those produced by hard k-means clustering. Its advantage becomes more pronounced as the data set grows larger and its dimension higher.
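The abstract does not state the exact rejection rule, so the following is only a minimal sketch of the general idea of using center displacements to skip distance computations. It assumes a generic triangle-inequality bound (one upper bound and one lower bound per point), which is not necessarily the criterion used by FKMCUCD. All function names (`kmeans_displacement`, `full_search`, `update_centers`) are hypothetical and chosen for illustration.

```python
import math


def update_centers(data, labels, centers):
    """Recompute each center as the mean of its members (keep empty ones)."""
    dim = len(data[0])
    new_centers = []
    for j, old in enumerate(centers):
        members = [p for p, a in zip(data, labels) if a == j]
        if members:
            n = len(members)
            new_centers.append([sum(p[t] for p in members) / n for t in range(dim)])
        else:
            new_centers.append(list(old))
    return new_centers


def full_search(p, centers):
    """Return (nearest index, nearest distance, second-nearest distance)."""
    ds = [math.dist(p, c) for c in centers]
    best = min(range(len(ds)), key=ds.__getitem__)
    second = min((d for j, d in enumerate(ds) if j != best), default=math.inf)
    return best, ds[best], second


def kmeans_displacement(data, init_centers, max_iter=100):
    """Hard k-means that skips full search when displacement bounds allow it.

    Assumed rule (illustrative, not taken from the paper): for each point keep
    an upper bound on the distance to its assigned center and a lower bound on
    the distance to every other center. After the centers move, the upper
    bound grows by the assigned center's displacement and the lower bound
    shrinks by the largest displacement. While upper < lower, the assignment
    cannot change, so no distances are computed for that point.
    """
    centers = [list(c) for c in init_centers]
    labels, upper, lower = [], [], []
    for p in data:
        a, d1, d2 = full_search(p, centers)
        labels.append(a)
        upper.append(d1)
        lower.append(d2)
    searches = len(data)  # number of full searches performed so far

    for _ in range(max_iter):
        new_centers = update_centers(data, labels, centers)
        shift = [math.dist(c, n) for c, n in zip(centers, new_centers)]
        centers = new_centers
        max_shift = max(shift)
        if max_shift == 0.0:  # no center moved: converged
            break
        for i, p in enumerate(data):
            upper[i] += shift[labels[i]]
            lower[i] -= max_shift
            if upper[i] < lower[i]:
                continue  # every other center is provably farther away
            labels[i], upper[i], lower[i] = full_search(p, centers)
            searches += 1
    return labels, centers, searches
```

In this sketch the bound updates cost a constant amount of work per point, and each distance evaluation costs time proportional to d, so the per-iteration cost stays linear in the dimension. Because a point is skipped only when its assignment provably cannot change, the result should match plain full-search k-means started from the same initial centers (barring exact distance ties).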

Keywords: k-Means clustering, Nearest-neighbor search, Knowledge discovery

Article history: Received 21 August 2008, Revised 26 February 2009, Accepted 28 February 2009, Available online 12 March 2009.

Official paper URL: https://doi.org/10.1016/j.patcog.2009.02.014