Weighted multi-view co-clustering (WMVCC) for sparse data

Authors: Syed Fawad Hussain, Khadija Khan, Rashad Jillani

Abstract

Multi-view clustering has gained importance recently due to the large-scale generation of data, often from multiple sources. It refers to clustering a set of objects that are described by multiple sets of features, known as views; for example, a movie can be described by its list of actors or by a textual summary of its plot. Co-clustering, on the other hand, refers to the simultaneous grouping of data samples and features, under the assumption that samples exhibit a pattern only under a subset of features. This paper combines multi-view clustering with co-clustering and proposes a new Weighted Multi-View Co-Clustering (WMVCC) algorithm. The motivation behind the approach is to use the diversity of features provided by multiple sources of information while exploiting the power of co-clustering. The proposed method extends the clustering objective function to a unified co-clustering objective function across all views. The algorithm follows the k-means strategy and iteratively optimizes the clustering by updating cluster labels, features, and view weights. A local search is also employed to optimize the clustering result using weighted multi-step paths in a graph. Experiments are conducted on several benchmark datasets. The results show that the proposed approach converges quickly and that its clustering performance significantly outperforms other recent and state-of-the-art algorithms on sparse datasets.
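The alternating, k-means-style loop described in the abstract can be illustrated with a toy sketch. This is not the WMVCC algorithm: it omits the co-clustering of features and the graph-based local search, and the weight update (inverse within-cluster error, normalized to sum to one) is an assumption made for illustration only. The function name `weighted_multiview_kmeans` and all of its parameters are hypothetical.

```python
import numpy as np

def weighted_multiview_kmeans(views, k, n_iter=20, seed=0):
    """Toy weighted multi-view k-means; NOT the WMVCC algorithm itself."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    labels = rng.integers(0, k, size=n)               # random initial assignment
    weights = np.full(len(views), 1.0 / len(views))   # start with equal view weights
    for _ in range(n_iter):
        # 1) Per-view centroids computed from the current labels.
        centroids = []
        for X in views:
            C = np.zeros((k, X.shape[1]))
            for c in range(k):
                members = X[labels == c]
                # An empty cluster is re-seeded with a random sample.
                C[c] = members.mean(axis=0) if len(members) else X[rng.integers(n)]
            centroids.append(C)
        # 2) Squared Euclidean distances per view, each of shape (n, k).
        dists = [((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
                 for X, C in zip(views, centroids)]
        # 3) Reassign each sample using the weighted sum of distances over views.
        total = sum(w * d for w, d in zip(weights, dists))
        new_labels = total.argmin(axis=1)
        # 4) Assumed weight rule: lower within-cluster error gives a higher weight.
        errors = np.array([d[np.arange(n), new_labels].sum() for d in dists])
        inv = 1.0 / (errors + 1e-12)
        weights = inv / inv.sum()
        if np.array_equal(new_labels, labels):
            break                                      # assignments are stable
        labels = new_labels
    return labels, weights
```

Each view is passed as a dense array with one row per object, so `views[0]` and `views[1]` must have the same number of rows. Because the assumed weight rule compares raw squared errors, views should be scaled comparably before use; a sparse-data implementation would also replace the dense distance computation.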

Keywords: Information fusion, Clustering, Co-clustering, Multi-view clustering

Paper URL: https://doi.org/10.1007/s10489-021-02405-3