An indication of unification for different clustering approaches

Abstract

Finding generic concepts and properties common to the different clustering approaches is an open problem. This question is addressed most thoroughly in Kleinberg's paper on the Impossibility Theorem (see [1]). Kleinberg introduced the notion of a clustering function, that is, a function that takes a dissimilarity measure defined on a data set S and returns a partition of S, together with a set of simple properties for the study of such functions: Scale Invariance, Richness and Consistency. The main result of [1] is the Impossibility Theorem: there is no clustering method satisfying all these properties. This result has been accepted as a rigorous proof of the difficulty of finding a unified framework for different clustering approaches.

Our goal in this paper is to provide primary concepts and results for the formal study of the various clustering approaches. To accomplish this, we discuss and expand on the ideas introduced by Kleinberg. Our guiding philosophy is to incorporate a crucial fact overlooked in [1]: clustering methods depend not only on the dissimilarity measure but also on other parameters, such as dissimilarity thresholds, centroids and stop criteria, among others. This paper gives a formal definition of a clustering method, reformulates the aforementioned properties, and introduces some new ones. In contrast to the result obtained in [1], many of the methods discussed here satisfy all of our properties. On these grounds, we glimpse a hint of unification among the different clustering approaches.
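As an illustration of why extra parameters matter, the following is a minimal sketch, not the formal definition given in the paper. It assumes a hypothetical threshold-based method, `threshold_clustering`, that links two points whenever their dissimilarity is at most `r` and returns the connected components as the partition. The data, the helper `scaled`, and all names are assumptions made for this example. With `r` held fixed, rescaling the dissimilarity changes the partition; rescaling `r` by the same factor recovers it.

```python
from itertools import combinations


def threshold_clustering(points, d, r):
    """Hypothetical method: connected components of the graph that links
    every pair of points whose dissimilarity d(x, y) is at most r."""
    parent = {p: p for p in points}

    def find(p):
        # Follow parent links to the component representative,
        # shortening the path along the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for x, y in combinations(points, 2):
        if d(x, y) <= r:
            parent[find(x)] = find(y)

    clusters = {}
    for p in points:
        clusters.setdefault(find(p), set()).add(p)
    return {frozenset(c) for c in clusters.values()}


def scaled(d, alpha):
    """Return the dissimilarity alpha * d (alpha > 0)."""
    return lambda x, y: alpha * d(x, y)


# Illustrative data: points on a line with absolute difference as dissimilarity.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
d = lambda x, y: abs(x - y)

base = threshold_clustering(points, d, r=1.5)
fixed_r = threshold_clustering(points, scaled(d, 3.0), r=1.5)   # threshold unchanged
scaled_r = threshold_clustering(points, scaled(d, 3.0), r=4.5)  # threshold scaled too

print(base == fixed_r)   # partition changes when only d is rescaled
print(base == scaled_r)  # partition is recovered when r is rescaled with d
```

Viewed as a function of the dissimilarity alone, this method is not scale invariant; viewed as a function of the pair (dissimilarity, threshold), its output is unchanged when both are rescaled together. This is only one informal reading of how a property might be restated once parameters are taken into account.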

Keywords: Data clustering, Clustering function, Impossibility theorem

Review history: Received 24 February 2012, Revised 15 February 2013, Accepted 27 February 2013, Available online 13 March 2013.

Official paper URL: https://doi.org/10.1016/j.patcog.2013.02.016