Clustering based on matrix approximation: a unifying view

Author: Tao Li

Abstract

Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. Recently, a number of methods based on matrix approximation have been proposed and have demonstrated good performance. Despite significant research on these methods, few attempts have been made to establish the connections between them while highlighting their differences. In this paper, we present a unified view of these methods within a general clustering framework, in which clustering is formulated as a matrix approximation problem and the clustering objective is to minimize the approximation error between the original data matrix and the matrix reconstructed from the cluster structures. The general framework provides an elegant basis for comparing and understanding various clustering methods. We characterize different clustering methods within the general framework, including traditional one-side clustering, subspace clustering and two-side clustering. We also establish the connections between our general clustering framework and existing frameworks.
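
The matrix-approximation objective described above can be illustrated with the familiar k-means special case. This is a minimal sketch and not the paper's own framework or code: it assumes a data matrix X (n rows, d columns), a binary cluster-indicator matrix A encoded as a list of labels, and a centroid matrix B, so that the reconstruction is A B and the error is the squared Frobenius norm of X - A B. All function names here are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: k-means viewed as the matrix approximation X ≈ A B.
# A (n-by-k, one 1 per row) is stored compactly as a list of cluster labels;
# B (k-by-d) holds one centroid per row. Names are hypothetical.

def reconstruct(labels, centroids):
    """Compute A B: each data row is replaced by the centroid of its cluster."""
    return [list(centroids[c]) for c in labels]

def approximation_error(X, labels, centroids):
    """Squared Frobenius norm of X - A B."""
    R = reconstruct(labels, centroids)
    return sum((x - r) ** 2
               for row_x, row_r in zip(X, R)
               for x, r in zip(row_x, row_r))

def update_centroids(X, labels, k):
    """Fix A, minimise over B: each centroid is the mean of its assigned rows."""
    d = len(X[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for row, c in zip(X, labels):
        counts[c] += 1
        for j, v in enumerate(row):
            sums[c][j] += v
    # An empty cluster falls back to the origin (an arbitrary choice for this sketch).
    return [[s / counts[c] if counts[c] else 0.0 for s in sums[c]]
            for c in range(k)]

def update_labels(X, centroids):
    """Fix B, minimise over A: assign each row to its nearest centroid."""
    def dist2(row, cen):
        return sum((a - b) ** 2 for a, b in zip(row, cen))
    return [min(range(len(centroids)), key=lambda c: dist2(row, centroids[c]))
            for row in X]

def alternate(X, labels, k, iters=10):
    """Alternating optimization: update B with A fixed, then A with B fixed."""
    for _ in range(iters):
        centroids = update_centroids(X, labels, k)
        labels = update_labels(X, centroids)
    centroids = update_centroids(X, labels, k)
    return labels, centroids

X = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centroids = alternate(X, [0, 1, 0, 1], k=2)
print(labels, centroids, approximation_error(X, labels, centroids))
```

Each of the two update steps can only lower or keep the approximation error, which is why alternating between them is a natural way to minimise it. The subspace and two-side variants mentioned in the abstract would change the structure of the factors and are not covered by this sketch.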

Keywords: Clustering, Matrix approximation, Alternating optimization, Subspace

Paper URL: https://doi.org/10.1007/s10115-007-0116-0