Collaborative fuzzy clustering of distributed concept-drifting dynamic data using a gossip-based approach

Author: Hoda Mashayekhi

Abstract

Clustering is a useful method for analyzing large data sets, including distributed data streams, which arise in a growing number of applications. This paper proposes a collaborative gossip-based approach for deriving a fuzzy clustering model of distributed dynamic data that exhibit concept drift. The proposed algorithm consists of a local phase and a collaborative phase. During both phases, data prototypes are constructed that together form a summarized view of the distributed data. This summarized view enables each node to extract a custom subset of the overall clustering model. Scalability is achieved by using gossip as a robust communication method and by avoiding excessive data transfer among nodes. When concept drift is present, the clustering model evolves incrementally and outdated parts of the summarized view are removed. Experimental results under different data distribution scenarios show that the proposed method detects fuzzy clusters efficiently and adapts to concept-drifting data, with bounded communication costs compared to other state-of-the-art algorithms.
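To make the idea of combining fuzzy clustering with gossip communication concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes scalar data, a fixed set of initial centers shared by all nodes, the standard fuzzy c-means membership formula with fuzzifier m, and plain pairwise gossip averaging on a fully connected network. All function names (`memberships`, `local_stats`, `gossip_average`, `centers_from`) are hypothetical. Each node summarizes its local data as per-cluster weighted sums, and nodes then exchange only these small summaries rather than raw data; prototype construction and concept-drift handling as described in the abstract are not modeled here.

```python
import random

def memberships(x, centers, m=2.0):
    """Standard fuzzy c-means membership of scalar x in each center."""
    dists = [abs(x - c) for c in centers]
    if 0.0 in dists:
        # A point that coincides with a center belongs fully to it.
        hit = dists.index(0.0)
        return [1.0 if k == hit else 0.0 for k in range(len(centers))]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** exp for dj in dists) for dk in dists]

def local_stats(data, centers, m=2.0):
    """Local phase (assumed): per-cluster [weighted sum, total weight]."""
    stats = [[0.0, 0.0] for _ in centers]
    for x in data:
        for k, u in enumerate(memberships(x, centers, m)):
            w = u ** m
            stats[k][0] += w * x
            stats[k][1] += w
    return stats

def gossip_average(node_stats, rounds, rng):
    """Collaborative phase (assumed): two random nodes average their stats."""
    n = len(node_stats)
    for _ in range(rounds):
        a, b = rng.sample(range(n), 2)
        for k in range(len(node_stats[a])):
            for j in (0, 1):
                mean = (node_stats[a][k][j] + node_stats[b][k][j]) / 2.0
                node_stats[a][k][j] = mean
                node_stats[b][k][j] = mean
    return node_stats

def centers_from(stats):
    """Updated center per cluster: weighted sum divided by total weight."""
    return [s / w for s, w in stats]

# Hypothetical setup: 8 nodes, each holding samples from two groups.
rng = random.Random(0)
centers = [0.0, 10.0]
nodes = [
    [rng.gauss(1.0, 0.5) for _ in range(50)]
    + [rng.gauss(9.0, 0.5) for _ in range(50)]
    for _ in range(8)
]
stats = [local_stats(d, centers) for d in nodes]
# Centralized reference, computed here only to check the gossip result.
global_stats = local_stats([x for d in nodes for x in d], centers)
gossip_average(stats, 400, rng)
```

Pairwise averaging preserves the network-wide sum of each statistic, so every node's ratio of weighted sum to weight approaches the center that a centralized fuzzy c-means update would compute over all the data. The per-exchange message size depends only on the number of clusters, not on the amount of local data, which is the sense in which this kind of scheme keeps communication bounded.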

Keywords: Distributed knowledge discovery, Dynamic data, Collaborative fuzzy clustering, Concept drift, Granular prototype, Gossip-based communication

Paper URL: https://doi.org/10.1007/s10489-018-1260-9