A fuzzy SV-k-modes algorithm for clustering categorical data with set-valued attributes

Authors:

Highlights:

Abstract

In this paper, we propose a fuzzy SV-k-modes algorithm that uses the fuzzy k-modes clustering process to cluster categorical data with set-valued attributes. The proposed algorithm uses the Jaccard coefficient to measure the dissimilarity between two objects and represents the center of each cluster with set-valued modes. A heuristic method for updating the cluster prototypes, given the fuzzy partition matrix, is developed. These extensions allow the fuzzy SV-k-modes algorithm to cluster categorical data with single-valued and set-valued attributes together, and the fuzzy k-modes algorithm is a special case of it. Experimental results on synthetic data sets and three real data sets from different applications demonstrate the efficiency and effectiveness of the fuzzy SV-k-modes algorithm.
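To make the dissimilarity measure concrete, the following is a minimal sketch of a Jaccard-based dissimilarity between two objects with set-valued attributes. The abstract does not give the exact formula, so the per-attribute form 1 - |A ∩ B| / |A ∪ B|, the summation over attributes, and the function names `jaccard_dissimilarity` and `object_dissimilarity` are assumptions made for illustration, not the authors' definitions.

```python
from typing import FrozenSet, Sequence

SetValue = FrozenSet[str]


def jaccard_dissimilarity(a: SetValue, b: SetValue) -> float:
    """Return 1 - |a & b| / |a | b| for one set-valued attribute.

    Two empty sets are treated as identical (dissimilarity 0.0);
    this convention is an assumption, not taken from the paper.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def object_dissimilarity(x: Sequence[SetValue], y: Sequence[SetValue]) -> float:
    """Sum the per-attribute Jaccard dissimilarities of two objects.

    Summing over attributes is an assumed aggregation, chosen to mirror
    the simple matching dissimilarity used by k-modes.
    """
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    return sum(jaccard_dissimilarity(a, b) for a, b in zip(x, y))


# Two objects with one set-valued and one single-valued attribute.
x = [frozenset({"a", "b"}), frozenset({"red"})]
y = [frozenset({"b", "c"}), frozenset({"red"})]
d = object_dissimilarity(x, y)  # first attribute shares 1 of 3 values, second matches
```

When every attribute holds a singleton set, the per-attribute value is 0 for a match and 1 for a mismatch, which is the simple matching dissimilarity of k-modes. This is consistent with the abstract's statement that fuzzy k-modes is a special case.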

Keywords: Categorical data, Set-valued attribute, Set-valued modes, Fuzzy k-modes, Fuzzy SV-k-modes

Review history: Received 18 June 2016, Revised 7 September 2016, Accepted 28 September 2016, Available online 10 October 2016, Version of Record 10 October 2016.

Paper URL: https://doi.org/10.1016/j.amc.2016.09.023