Scalable visual assessment of cluster tendency for large data sets

Abstract

Determining whether clusters are present in a data set (i.e., assessing cluster tendency) is an important first step in cluster analysis. The visual assessment of cluster tendency (VAT) tool has been successful in revealing potential cluster structure in various data sets, but it can be computationally expensive for large data sets. In this article, we present a new scalable, sample-based version of VAT that is feasible for large data sets. We include analysis and numerical examples that demonstrate the new scalable VAT algorithm.
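
For orientation, the following is a minimal sketch of the basic VAT reordering as it is commonly described: a Prim-like ordering of a pairwise dissimilarity matrix. It is not the scalable, sample-based variant this article introduces, whose details the abstract does not give. The function names `vat_order` and `vat_image` are hypothetical, and the input is assumed to be a symmetric, square dissimilarity matrix.

```python
import numpy as np


def vat_order(D):
    """Return a VAT-style ordering of objects (assumed: D is square, symmetric)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Start from one endpoint of the largest pairwise dissimilarity.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    i = int(i)
    order = [i]
    selected = np.zeros(n, dtype=bool)
    selected[i] = True
    # min_dist[j] = smallest dissimilarity from object j to any selected object.
    min_dist = D[i].copy()
    for _ in range(n - 1):
        # Pick the unselected object closest to the selected set.
        candidates = np.where(selected, np.inf, min_dist)
        j = int(np.argmin(candidates))
        order.append(j)
        selected[j] = True
        min_dist = np.minimum(min_dist, D[j])
    return order


def vat_image(D):
    """Return the reordered dissimilarity matrix and the ordering used."""
    D = np.asarray(D, dtype=float)
    order = vat_order(D)
    return D[np.ix_(order, order)], order
```

Displaying the reordered matrix as a grayscale image is what makes the tendency assessment visual: dark blocks along the diagonal suggest groups of mutually similar objects. The full matrix needs memory that grows with the square of the number of objects, which is the kind of cost that motivates a sample-based version.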

Keywords: Clustering, Similarity measures, Cluster validity, Data visualization, Scalability

Review history: Received 12 August 2005, Revised 6 February 2006, Accepted 10 February 2006, Available online 29 March 2006.

Official paper URL: https://doi.org/10.1016/j.patcog.2006.02.011