Bootstrap technique in cluster analysis

Authors:

Highlights:

Abstract

We define a method that uses the bootstrap technique to estimate the number of clusters in a data set E. The approach generates several “fake” data sets by sampling patterns from E with replacement (bootstrapping). For each candidate number of clusters K, a measure of the stability of the K-cluster partitions across the bootstrap samples characterizes the significance of the K-cluster partition of the original data set. The value of K that yields the most stable partitions is taken as the estimate of the number of clusters in E. The technique's performance is demonstrated on both synthetic and real data, and the technique is applied to the segmentation of range images.
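The procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the stability measure or the clustering details, so everything below is an assumption made for illustration: the stability of K is taken to be the mean Rand index between the partitions of E induced by k-means centroids fitted on two independent bootstrap samples, and k-means uses a farthest-first initialization. The function names (`kmeans`, `rand_index`, `stability`, `estimate_k`) and the toy data are hypothetical and are not taken from the paper.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's k-means with farthest-first initialization (an assumption)."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same/different cluster)."""
    n = len(a)
    agree = sum(
        (a[i] == a[j]) == (b[i] == b[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return agree / (n * (n - 1) / 2)


def stability(data, k, rng, n_pairs=10):
    """Mean agreement between partitions of `data` derived from pairs of bootstrap samples."""
    scores = []
    for _ in range(n_pairs):
        labelings = []
        for _ in range(2):
            sample = rng.choices(data, k=len(data))  # sample with replacement
            centers = kmeans(sample, k, rng)
            labelings.append([nearest(p, centers) for p in data])
        scores.append(rand_index(*labelings))
    return sum(scores) / len(scores)


def estimate_k(data, k_values, rng):
    """Return the K with the highest stability, plus all stability scores."""
    scores = {k: stability(data, k, rng) for k in k_values}
    return max(scores, key=scores.get), scores


# Toy data set E: three well-separated 2-D blobs (hypothetical example data).
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
data = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in blob_centers
    for _ in range(20)
]
best_k, scores = estimate_k(data, range(2, 6), rng)
print(best_k, scores)
```

K = 1 is excluded from the candidates because a single-cluster partition is trivially stable under this measure. Stability scores of this kind can also favor small K, so the sketch should be read as an illustration of the idea rather than a calibrated estimator.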

Keywords: Bootstrap, Pattern recognition, K-means clustering, Hierarchical clustering, Cluster validity, Image segmentation, Cluster tendency

Review history: Received 16 September 1986, Available online 19 May 2003.

Paper URL: https://doi.org/10.1016/0031-3203(87)90081-1