NMFE-SSCC: Non-negative matrix factorization ensemble for semi-supervised collective classification

Authors:

Highlights:

Abstract

Collective classification (CC) is the task of jointly classifying related instances of network data. Enabling CC usually improves the performance of predictive models on fully labeled training networks with a large amount of labeled data. However, acquiring such labels can be difficult and costly, and learning a CC classifier from only a few labeled instances can lead to poor performance. On the other hand, a large amount of unlabeled data is usually available in practice. This naturally motivates semi-supervised collective classification (SSCC) approaches, which leverage the unlabeled data to improve CC on a sparsely labeled network. In this paper, we propose a novel SSCC algorithm based on non-negative matrix factorization (NMF), called NMF-SSCC, that learns a data representation by exploiting both labeled and unlabeled data on the network. Our idea is to use matrix factorization to obtain a compact representation of the network data that captures the class discrimination inferred from the labeled instances while respecting the intrinsic network structure. To achieve this, we design a new matrix factorization objective function that incorporates a label matrix factorization term and a network regularization term. An efficient optimization algorithm based on multiplicative update rules is then developed to solve the new objective function. To further boost predictive performance, we extend the proposed NMF-SSCC method into an ensemble scheme, called NMFE-SSCC, which builds a classification ensemble from a set of NMF-SSCC collective classifiers, each using a differently constructed latent graph. Each NMF-SSCC classifier is learned from one latent graph generated with various latent linkages for effective label propagation. Experimental results on real-world data sets demonstrate the effectiveness of the new methods.
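The abstract does not give the objective function itself, so the following is only a minimal sketch of the general idea, assuming a graph-regularized NMF objective with an added label-fitting term: ||X - U V^T||_F^2 + lam * Tr(V^T (D - W) V) + beta * ||S (Y - V B)||_F^2. The function name `nmf_sscc_sketch`, the symbols U, V, B, S, the weights `lam` and `beta`, and the toy data are all assumptions made for illustration and are not taken from the paper. The multiplicative updates follow the standard rule of dividing the negative part of each gradient by its positive part.

```python
import numpy as np

def nmf_sscc_sketch(X, W, Y, labeled, k=3, lam=1.0, beta=1.0, n_iter=200, seed=0):
    """Hypothetical sketch, not the paper's algorithm.

    X: (m, n) non-negative feature matrix, one column per node.
    W: (n, n) symmetric non-negative adjacency matrix.
    Y: (n, c) one-hot labels; rows of unlabeled nodes are ignored.
    labeled: (n,) boolean mask marking the labeled nodes.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    c = Y.shape[1]
    eps = 1e-12                              # guards against division by zero
    U = rng.random((m, k))                   # basis vectors
    V = rng.random((n, k))                   # latent representation of each node
    B = rng.random((k, c))                   # maps latent factors to classes
    s = labeled.astype(float)[:, None]       # diagonal of S, stored as a column
    d = W.sum(axis=1)[:, None]               # node degrees (diagonal of D)
    Ys = s * Y                               # labels with unlabeled rows zeroed

    def objective():
        recon = np.linalg.norm(X - U @ V.T) ** 2
        smooth = np.trace(V.T @ (d * V - W @ V))       # Tr(V^T (D - W) V)
        label = np.linalg.norm(s * (Y - V @ B)) ** 2
        return recon + lam * smooth + beta * label

    history = [objective()]
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive part),
        # which keeps every factor non-negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V) + beta * (Ys @ B.T)) / (
            V @ (U.T @ U) + lam * (d * V) + beta * (s * (V @ B)) @ B.T + eps)
        B *= (V.T @ Ys) / (V.T @ (s * (V @ B)) + eps)
        history.append(objective())
    return U, V, B, history

# Toy network (made up for this example): two groups of 10 nodes,
# links only within a group, and two labeled nodes per group.
rng = np.random.default_rng(1)
n, m, c = 20, 6, 2
y = np.repeat([0, 1], n // 2)
X = rng.random((m, n))
X[:3, y == 0] += 1.0                         # features 0-2 are stronger in class 0
X[3:, y == 1] += 1.0                         # features 3-5 are stronger in class 1
W = (y[:, None] == y[None, :]).astype(float)
np.fill_diagonal(W, 0.0)
Y = np.eye(c)[y]
labeled = np.zeros(n, dtype=bool)
labeled[[0, 1, 10, 11]] = True

U, V, B, history = nmf_sscc_sketch(X, W, Y, labeled)
predicted = np.argmax(V @ B, axis=1)         # class scores for all nodes
```

In this sketch, labels for unlabeled nodes are read off as the largest entry in each row of `V @ B`. The ensemble step described in the abstract, which trains several such classifiers on different latent graphs and combines them, is not covered here.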

Keywords: Collective classification, Non-negative matrix factorization, Semi-supervised collective classification, Ensemble classification

Article history: Received 20 November 2014, Revised 22 April 2015, Accepted 14 June 2015, Available online 17 July 2015, Version of Record 19 October 2015.

Official paper page: https://doi.org/10.1016/j.knosys.2015.06.026