A combined algorithm for weighting the variables and clustering in the clustering problem

Abstract

One problem in cluster (classification) analysis is whether the original variables should be transformed in some way before they are used by the clustering algorithm. More often than not, the original variables do require some transformation. The transformation may be intended to produce more compact clusters in the space of the transformed variables, to take into account the different nature and/or units of the variables involved, to allow for the different or equal ‘importance’ of different variables, to minimize the number of variables used, etc. Among the linear transformations of variables we distinguish two groups: those which change only the scales of the variables (often called weighting procedures), and those which also rotate the space of variables (a good example is the method of principal components (1)). This paper addresses the former group of transformations.

One strong reason for using weighted variables (as opposed to their linear combinations) is that the results of the classification can then be interpreted in terms of the original (physical) variables. Unfortunately, weighting the variables can ‘spoil’ the compactness of the clusters in the space of the weighted variables if the weighting procedure ‘does not care’ about the results of clustering (in other words, if the weighting is done prior to and independently of the clustering).

This paper suggests a method of weighting the variables which is a part of the classification procedure and thus guarantees an improvement of the cluster clarity. The weights of variables and the clusters of objects produced by the algorithm correspond to a local minimum of some classification criterion. Because of this, the resultant weights can be interpreted as a measure of the ‘importance’ of the variables for the purpose of classification. These weights are compared with those given by such popular weighting procedures as the equal variance (6) and Mahalanobis distance (7) methods. Two examples of the performance of the algorithm are presented.
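
For orientation, the two reference procedures named above can be sketched in a few lines of NumPy. This is only an illustration under assumed textbook definitions (equal variance weighting as division of each variable by its sample standard deviation, Mahalanobis distance as the quadratic form with the inverse sample covariance matrix). The function names and the synthetic data are hypothetical, and the sketch is not the combined algorithm proposed in the paper.

```python
import numpy as np

def equal_variance_weights(X):
    """Assumed definition: weight 1/std per variable, so each weighted variable has unit variance."""
    return 1.0 / X.std(axis=0, ddof=1)

def weighted_sq_distance(x, y, w):
    """Squared Euclidean distance after rescaling each variable by its weight (scales only, no rotation)."""
    d = (x - y) * w
    return float(d @ d)

def mahalanobis_sq_distance(x, y, X):
    """Squared Mahalanobis distance between x and y, using the sample covariance of the data X."""
    S = np.cov(X, rowvar=False)
    d = x - y
    return float(d @ np.linalg.solve(S, d))

# Synthetic data (illustrative only): three variables measured on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([1.0, 10.0, 100.0])

w = equal_variance_weights(X)
d_weighted = weighted_sq_distance(X[0], X[1], w)
d_mahal = mahalanobis_sq_distance(X[0], X[1], X)
```

Both procedures are computed from the data alone, before any clustering is done, which is exactly the independence from the clustering result that the abstract identifies as a weakness.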