A study on semi-supervised FCM algorithm

Authors: Shan Zeng, Xiaojun Tong, Nong Sang, Rui Huang

Abstract

Most variants of the fuzzy c-means (FCM) clustering algorithm that incorporate prior knowledge modify either the objective function or the clustering process. This paper proposes a new weighted semi-supervised FCM algorithm (SSFCM-HPR) that transforms the prior knowledge in the labeled samples into constraints on the fuzzy membership degrees, assigns each labeled sample a weight according to its representativeness, and then uses the HPR multiplier to solve the clustering problem. The “representativeness” of a labeled sample is determined by its distance to the center of the cluster it belongs to. In this paper, we take the ratio of the largest to the second-largest fuzzy membership degree of a labeled sample as its weight. The algorithm not only retains the fuzzy partition of the labeled samples, which guarantees effective guidance of the clustering process, but can also detect whether a sample is an outlier. Moreover, when part of the supervised information in the labeled samples is wrong, the algorithm can reduce the influence of the incorrectly labeled samples on the final clustering results. Experimental evaluation on synthetic and real data sets demonstrates the efficiency and effectiveness of our approach.
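The weighting rule in the abstract can be illustrated with a minimal sketch. It is not the SSFCM-HPR algorithm itself: it assumes the standard FCM membership formula with fixed, known cluster centers and a fuzzifier m = 2, and it omits the constraint handling and the HPR multiplier step entirely. The function names `fcm_memberships` and `representativeness_weights` and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def fcm_memberships(X, centers, m=2.0):
    """Standard FCM memberships for fixed centers (assumption: Euclidean distance)."""
    # Pairwise distances, shape (n_samples, n_clusters).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # avoid division by zero when a sample sits on a center
    inv = d ** (-2.0 / (m - 1.0))
    # Normalize so each row sums to 1.
    return inv / inv.sum(axis=1, keepdims=True)

def representativeness_weights(U):
    """Ratio of the largest to the second-largest membership of each sample."""
    top2 = np.sort(U, axis=1)[:, -2:]  # columns: second-largest, largest
    return top2[:, 1] / top2[:, 0]

# Toy data (hypothetical): three samples on a line, two cluster centers.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
centers = np.array([[0.0, 0.0], [4.0, 0.0]])

U = fcm_memberships(X, centers)
w = representativeness_weights(U)
```

In this toy setup, a sample close to one center gets a large ratio, while a sample equidistant from both centers gets a ratio of 1, which matches the intuition that ambiguous samples should carry less weight as supervision.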

Keywords: FCM algorithm, Semi-supervised clustering, HPR multiplier, Incorrectly labeled samples

Official paper URL: https://doi.org/10.1007/s10115-012-0521-x