Deep convolutional self-paced clustering

Authors: Rui Chen, Yongqiang Tang, Lei Tian, Caixia Zhang, Wensheng Zhang

Abstract

Clustering is a crucial but challenging task in data mining and machine learning. Recently, deep clustering, which derives inspiration primarily from deep learning approaches, has achieved state-of-the-art performance in various applications and attracted considerable attention. Nevertheless, most of these approaches fail to effectively learn informative cluster-oriented features for data with spatial correlation structure, e.g., images. To tackle this problem, in this paper, we develop a deep convolutional self-paced clustering (DCSPC) method. Specifically, in the pretraining stage, we propose to utilize a convolutional autoencoder to extract a high-quality data representation that contains the spatial correlation information. Then, in the finetuning stage, a clustering loss is directly imposed on the learned features to jointly perform feature refinement and cluster assignment. We retain the decoder to avoid the feature space being distorted by the clustering loss. To stabilize the training process of the whole network, we further introduce a self-paced learning mechanism and select the most confident samples in each iteration. Through comprehensive experiments on seven popular image datasets, we demonstrate that the proposed algorithm can consistently outperform state-of-the-art rivals.
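The self-paced selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hard-threshold rule in which a sample counts as "confident" when its per-sample clustering loss falls below a threshold that grows between rounds. The function names, the loss values, and the doubling schedule are all hypothetical, and the autoencoder and reconstruction loss are omitted.

```python
from typing import List, Sequence


def select_confident(losses: Sequence[float], threshold: float) -> List[int]:
    """Return indices of samples whose clustering loss is below the threshold."""
    return [i for i, loss in enumerate(losses) if loss < threshold]


def selected_clustering_loss(losses: Sequence[float], threshold: float) -> float:
    """Average the clustering loss over the currently selected samples only."""
    chosen = select_confident(losses, threshold)
    if not chosen:
        return 0.0  # nothing selected yet, so no clustering signal this round
    return sum(losses[i] for i in chosen) / len(chosen)


# Hypothetical per-sample clustering losses
# (for example, squared distance to the nearest centroid in feature space).
per_sample_loss = [0.05, 0.40, 0.10, 0.90, 0.25]

# Assumed schedule: the threshold doubles each round,
# so harder samples are admitted gradually.
threshold = 0.2
for round_idx in range(3):
    chosen = select_confident(per_sample_loss, threshold)
    mean_loss = selected_clustering_loss(per_sample_loss, threshold)
    print(round_idx, chosen, mean_loss)
    threshold *= 2.0
```

In a full training loop, the selected indices would decide which samples contribute to the clustering loss in that iteration, while the reconstruction loss from the retained decoder would presumably still be computed on every sample.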

Keywords: Deep clustering, Convolutional autoencoder, Local structure preservation, Self-paced learning

Paper URL: https://doi.org/10.1007/s10489-021-02569-y