Model-based co-clustering for the effective handling of sparse data

Authors:

Highlights:

• We propose a Sparse Poisson Latent Block Model (SPLBM) which has been designed to deal with data sparsity problems.

• We derive two EM-type co-clustering algorithms based on a variational approach.

• Extensive experiments show the effectiveness of the proposed algorithms on various challenging real-world text datasets.

• The obtained co-clusters are meaningful, semantically coherent and faithfully follow the pattern of the real classes.

Keywords: Mixture models, Poisson distribution, Latent block model, Co-clustering, Variational EM, Text data, Sparse data

Review history: Received 10 December 2016, Revised 26 May 2017, Accepted 1 June 2017, Available online 3 July 2017, Version of Record 10 July 2017.

Official paper URL: https://doi.org/10.1016/j.patcog.2017.06.005