Vector quantization based approximate spectral clustering of large datasets
Authors:
Abstract
Spectral partitioning, recently popular for unsupervised clustering, is infeasible for large datasets because of its computational complexity and memory requirements. Approximate spectral clustering (ASC) has therefore been applied to data representatives selected by various sampling methods. As an alternative, we propose using neural networks (self-organizing maps and neural gas), which have been shown to quantize data with small distortion, as the preliminary sampling step for ASC. We show that they usually outperform k-means sampling, which was previously shown to be superior to various other sampling methods, in terms of the clustering accuracy obtained by ASC. More importantly, for quantization-based ASC we introduce a local density-based similarity measure, constructed without any user-set parameter, which achieves higher accuracies than the commonly used distance-based similarity.
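To make the idea of a parameter-free, local density-based similarity concrete, the following is a minimal sketch. The abstract does not define the measure, so the sketch assumes the usual description of CONN similarity (named in the keywords): the similarity of two prototypes is the number of data points for which those two prototypes are the nearest and second-nearest quantization units, in either order. The function names, the toy prototypes, and the toy data are hypothetical and are not taken from the paper.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def conn_similarity(data, prototypes):
    """Build a symmetric prototype-by-prototype similarity matrix.

    Assumed definition: entry (i, j) counts the data points whose two
    nearest prototypes are i and j, in either order. No threshold or
    kernel width is needed. Requires at least two prototypes.
    """
    n = len(prototypes)
    conn = [[0] * n for _ in range(n)]
    for point in data:
        # Rank prototype indices by distance to this data point.
        ranked = sorted(range(n), key=lambda k: squared_distance(point, prototypes[k]))
        first, second = ranked[0], ranked[1]
        conn[first][second] += 1
        conn[second][first] += 1
    return conn


# Hypothetical toy example: three prototypes on a line, four data points.
prototypes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
data = [(0.1, 0.0), (0.4, 0.1), (0.9, 0.0), (4.8, 0.0)]
conn = conn_similarity(data, prototypes)
for row in conn:
    print(row)
```

In this toy case the two close prototypes share three data points while the distant prototype shares only one with its nearest neighbor, so the matrix reflects local data density between prototypes and not only their distance. In an ASC pipeline, a matrix of this kind would take the place of a distance-based affinity matrix over the prototypes before the spectral step.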
Keywords: Spectral clustering, Large datasets, Vector quantization, Self-organizing maps, Neural gas, CONN similarity, Connectivity
Review history: Received 11 May 2011, Revised 13 January 2012, Accepted 13 February 2012, Available online 24 February 2012.
Official paper URL: https://doi.org/10.1016/j.patcog.2012.02.012