Fast semi-supervised clustering with enhanced spectral embedding

Authors:

Highlights:

Abstract

In recent years, semi-supervised clustering (SSC) has attracted considerable interest from the machine learning and data mining communities. In this paper we propose a novel SSC approach with enhanced spectral embedding (ESE), which not only considers the geometric structure of the data but also makes use of given side information such as pairwise constraints. Specifically, we first construct a symmetry-favored k-NN graph, which is highly robust to noise and outliers and reflects the underlying manifold structure of the data. We then learn an enhanced spectral embedding that moves toward an ideal data representation, one that is as consistent as possible with the given pairwise constraints. Finally, by regularizing the spectral embedding, we formulate learning the new data representation as a semidefinite-quadratic-linear programming (SQLP) problem, which can be solved efficiently. Experimental results on a variety of synthetic and real-world data sets show that our ESE approach outperforms state-of-the-art SSC algorithms in both speed and quality on vector-based and graph-based clustering.
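As a rough illustration of the first two stages named in the abstract (graph construction followed by a spectral embedding), the sketch below may help. It rests on several assumptions that the abstract does not confirm: the "symmetry-favored" graph is taken to be W = A + A^T, where A is a Gaussian-weighted directed k-NN adjacency matrix, and the embedding uses the eigenvectors of the normalized graph Laplacian with the smallest eigenvalues. The function names and the parameters `k`, `sigma`, and `m` are hypothetical. The constraint-driven enhancement and the SQLP step are not shown.

```python
import numpy as np

def symmetry_favored_knn_graph(X, k=5, sigma=1.0):
    """Assumed construction: W = A + A^T, with A a weighted directed k-NN graph."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    # Indices of the k nearest neighbors of each point.
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma ** 2))
    # Mutual neighbors receive double weight, one-sided edges are kept.
    return A + A.T

def spectral_embedding(W, m=2):
    """Eigenpairs of the normalized Laplacian with the m smallest eigenvalues."""
    n = W.shape[0]
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vals[:m], vecs[:, :m]
```

In this sketch, summing A with its transpose keeps every k-NN edge while giving mutual neighbors a larger weight, which is one plausible reading of "symmetry-favored". The resulting embedding would be the input that pairwise constraints then adjust.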

Keywords: Semi-supervised clustering (SSC), Side information, Spectral embedding, Pairwise constraint, Semantic gap

Article history: Received 5 January 2011, Revised 7 May 2012, Accepted 11 May 2012, Available online 31 May 2012.

Paper URL: https://doi.org/10.1016/j.patcog.2012.05.007