Sophisticated SOM based genetic operators in multi-objective clustering framework

Authors: Naveen Saini, Sriparna Saha, Aditya Harsh, Pushpak Bhattacharyya

Abstract

Multi-objective clustering refers to partitioning a given collection of objects into K groups based on some similarity/dissimilarity criterion while simultaneously optimizing several partition quality measures. This paper proposes an automated decomposition-based multi-objective clustering technique, SOMDEA_clust, which fuses a self-organizing map (SOM) with multi-objective differential evolution. A novel reproduction operator is designed in which an ensemble of multiple neighborhoods extracted using the SOM is used to construct a mating pool of variable size. The probabilities of selecting the different neighborhood sizes are updated based on how well each size has performed in generating new, improved solutions over the last few generations. A decomposition-based selection scheme is also used, which divides the multi-objective optimization (MOO) problem into a number of single-objective subproblems. The objective functions corresponding to these subproblems are optimized collaboratively using MOO. The potential of the proposed framework is demonstrated by clustering four real-life data sets and five artificial data sets, in comparison with several existing multi-objective clustering techniques (MOCK, SMEA_clust, and MEA_clust), a single-objective genetic clustering technique (SOGA), and a traditional clustering technique (K-means). To show the utility of the SOM-based reproduction operators, another decomposition-based multi-objective clustering technique without SOM-based operators, MDEA_clust, is also developed in this paper. To show the efficacy of the proposed clustering technique in handling large data sets, two large-scale data sets with more than 5000 data points are also used.
As a real-life application, the proposed clustering technique is applied to scientific/web document clustering, where a set of scientific/web documents is partitioned based on content similarity. A semantic representation is used to convert each text document into a real-valued vector. Experimental results illustrate the effectiveness of fusing SOM and DE in developing an effective clustering technique.
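
The abstract states that the probabilities of selecting different neighborhood sizes are updated according to their recent success in producing improved solutions, but it does not give the update rule. The following is a minimal sketch of one plausible scheme, not the authors' implementation: each candidate size is scored by its success rate over a sliding window of generations, plus a small floor so that no size is ever ruled out. The class name `NeighborhoodSizeSelector` and the parameters `window` and `floor` are hypothetical and chosen only for illustration.

```python
import random
from collections import deque


class NeighborhoodSizeSelector:
    """Adaptive choice among candidate neighborhood sizes (illustrative only)."""

    def __init__(self, sizes, window=5, floor=0.05, seed=None):
        self.sizes = list(sizes)
        self.floor = floor  # assumed smoothing term; keeps every size selectable
        self.rng = random.Random(seed)
        # One entry per finished generation: {size: [successes, trials]}.
        # maxlen drops the oldest generation once the window is full.
        self.history = deque(maxlen=window)
        self.current = {s: [0, 0] for s in self.sizes}

    def probabilities(self):
        """Selection probability of each size, in the order of self.sizes."""
        scores = []
        for s in self.sizes:
            successes = sum(gen[s][0] for gen in self.history)
            trials = sum(gen[s][1] for gen in self.history)
            rate = successes / trials if trials else 0.0
            scores.append(rate + self.floor)
        total = sum(scores)
        return [score / total for score in scores]

    def choose(self):
        """Sample one neighborhood size according to the current probabilities."""
        return self.rng.choices(self.sizes, weights=self.probabilities(), k=1)[0]

    def record(self, size, improved):
        """Log whether a child built with this neighborhood size was an improvement."""
        self.current[size][1] += 1
        if improved:
            self.current[size][0] += 1

    def end_generation(self):
        """Push this generation's counts into the sliding window and reset them."""
        self.history.append(self.current)
        self.current = {s: [0, 0] for s in self.sizes}
```

In an evolutionary loop, one would call `choose()` before building each mating pool, `record(size, improved)` after evaluating the resulting child, and `end_generation()` once per generation. With no history, all sizes are equally likely; sizes that recently produced improvements then receive proportionally more weight.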

Keywords: Clustering, Cluster validity indices, Self organizing map (SOM), Differential evolutionary algorithm (DE), Polynomial mutation, Multi-objective optimization (MOO)

Paper URL: https://doi.org/10.1007/s10489-018-1350-8