Local graph based correlation clustering

Authors:

Highlights:

Abstract

In high-dimensional data, clusters often exist in the form of complex hierarchical relationships. Exploring these relationships requires integrating dimensionality reduction techniques with data mining approaches and graph theory. The correlations among data points emerge more clearly when this integration is tight. We propose an approach called Local Graph Based Correlation Clustering (LGBACC). This approach combines hierarchical clustering with PCA to uncover complex hierarchical relationships and uses graph models to visualize the results. We propose a framework for this approach that is divided into four phases, each tightly integrated with the next. Visualization of the data after each phase is an important output and is built into the framework. The focus of the technique remains on obtaining high-quality clusters. The quality of the final clusters is measured using standard indices, and LGBACC is found to perform better than existing hierarchical clustering approaches. We validate the framework on real-world datasets that cover both low- and high-dimensional data, and LGBACC produces high-quality clusters across this wide spectrum of dimensionality. Scalability tests on synthetically generated high-dimensional and large datasets show that the proposed approach runs efficiently. Hence, LGBACC is an efficient and scalable approach that produces high-quality clusters in high-dimensional and large data spaces.

Keywords: Correlation clustering, Dimensionality reduction, Graph analysis, Hierarchical clustering, Cluster quality

Review history: Received 5 February 2017, Revised 26 September 2017, Accepted 30 September 2017, Available online 9 October 2017, Version of Record 13 November 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.09.034