A new clustering algorithm with multiple runs of iterative procedures

作者：

Highlights：

•

摘要

The K-means clustering algorithm is well known and widely used. Recently, an alternative algorithm based on an earlier idea (Duda and Hart, Pattern Classification and Scene Analysis (1973); Ismail and Kamel, Pattern Recognition22, 75–89 (1989)) has been proposed; in comprehensive experiments we verify that this alternative is superior to K-means. Both techniques suffer from the complexity of the cost function surface they are descending, and a consequent inclination to fall into poor local minima. A popular solution to this problem is to rerun the chosen iterative procedure from many randomly chosen starting positions, saving the best. We introduce here an alternative strategy to form the starting position for such a multiple run which consists of perturbing an existing result in an attempt to improve its poor aspects. This “controlled configuration change” has comparable complexity with random initialisation but usually produces superior clustering results. Comprehensive experiments confirm this result.

论文关键词：Clustering problem,Configuration,K-means,Random initialisation,Cost minimisation,Global minimum

论文评审过程：Received 1 June 1990, Revised 30 January 1991, Available online 19 May 2003.

论文官网地址：https://doi.org/10.1016/0031-3203(91)90003-N