An analysis of the effects of sample size on classification performance of a histogram based cluster analysis procedure

Authors:

Abstract

A non-parametric, unsupervised program was developed to identify clusters in multidimensional data by mode analysis using histograms. An implicit assumption in the histogram approach is that a relatively large number of samples is required to ensure an accurate classification. Tests with randomly generated data show that this assumption does not hold: a small number of samples does not necessarily result in a poor classification, nor does a relatively large number of samples guarantee the best classification. The histogram classifier was compared with two parametric classifiers, maximum likelihood and K-means clustering. Timing results show that, although the parametric classifiers are more efficient for a small number of samples, the histogram approach uses less CPU time for a large number of samples.
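
To make the idea of mode analysis on a histogram concrete, the following is a minimal sketch and not the program described in the abstract. It assumes a uniform grid with a single bin width on every axis, and it assigns each occupied cell to the local maximum reached by repeatedly stepping to the fullest adjacent cell. The function name `histogram_cluster`, the `bin_width` parameter, and the tie-breaking rule are all illustrative assumptions.

```python
from collections import Counter
from itertools import product


def histogram_cluster(points, bin_width):
    """Label each point with the histogram mode its cell climbs to.

    Illustrative sketch only: assumes a non-empty list of equal-length
    tuples and one bin width shared by every axis.
    """
    # Bin every point into an integer grid cell (floor division per axis).
    cells = [tuple(int(x // bin_width) for x in p) for p in points]
    counts = Counter(cells)
    dim = len(cells[0])
    # Offsets to all adjacent cells, excluding the cell itself.
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    def uphill(cell):
        # Move to the fullest occupied neighbour; ties on count are broken
        # by cell index so the climb is deterministic and cannot cycle.
        best = cell
        for o in offsets:
            nb = tuple(c + d for c, d in zip(cell, o))
            if nb in counts and (counts[nb], nb) > (counts[best], best):
                best = nb
        return best

    def mode_of(cell):
        # Follow uphill steps until a local maximum (a mode) is reached.
        nxt = uphill(cell)
        while nxt != cell:
            cell, nxt = nxt, uphill(nxt)
        return cell

    # Climb once per occupied cell, not once per sample.
    cell_mode = {cell: mode_of(cell) for cell in counts}

    # Number the modes in order of first appearance among the points.
    mode_ids = {}
    return [mode_ids.setdefault(cell_mode[c], len(mode_ids)) for c in cells]


blobs = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4),
         (5.1, 5.2), (5.3, 5.1), (5.2, 5.4)]
print(histogram_cluster(blobs, 1.0))  # → [0, 0, 0, 0, 1, 1, 1]
```

In this sketch the samples are visited only to bin them and to look up their labels; the hill climbing runs over occupied cells. That is one plausible reason a histogram method could scale well with sample count, but it is an observation about this sketch and not a description of the timing experiments reported above.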

Keywords: Image Processing, Cluster Analysis, Histograms

Review history: Received 23 November 1982, Revised 29 April 1983, Accepted 20 May 1983, Available online 19 May 2003.

Official paper URL: https://doi.org/10.1016/0031-3203(84)90062-1