Classifier chains for positive unlabelled multi-label learning

Authors:

Highlights:

Abstract

In the traditional multi-label setting, it is assumed that all relevant labels are assigned to a given instance. In the positive unlabelled setting, only some of the relevant labels are assigned. The presence of a label means that the instance is truly associated with it, whereas the absence of a label does not imply that the label is irrelevant to the instance. For example, when predicting multiple diseases in one patient, some diseases may be undiagnosed; this does not mean that the patient does not have them. Classifier chains are among the most popular and successful methods in standard multi-label classification, mainly due to their simplicity and high predictive power. However, adapting classifier chains to the positive unlabelled framework is not straightforward, because the true target variables are only partially observed and therefore cannot be used directly to train the models in the chain. The partial observability affects not only the current target variable in the chain but also the feature space, which further increases the difficulty of the problem. In this paper, we investigate the possibility of using classifier chains in the positive unlabelled setting. We propose two methods that modify the classifiers in the chain to account for partial data observability. In the first method (called CCPU), we scale the output probabilities of the consecutive classifiers in the chain. In the second method (called CCPUW), we minimize a weighted empirical risk, with weights depending on the prior probabilities of the target variables. Moreover, both methods use modified feature spaces. The predictive performance of the proposed methods is studied on real multi-label datasets for different positive unlabelled settings.
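
The idea of scaling the outputs of consecutive classifiers can be illustrated with a minimal sketch. It is not the paper's CCPU algorithm: it assumes that relevant labels are observed completely at random with a known label frequency c = P(label observed | label relevant), so that P(label relevant | x) can be estimated as P(label observed | x) / c. The function names, the `label_frequencies` argument, and the choice to append each scaled probability to the feature vector of the next classifier are all hypothetical, chosen only for illustration.

```python
from typing import Callable, List, Sequence


def scale_pu_probability(p_observed: float, label_frequency: float) -> float:
    """Estimate P(label relevant | x) from P(label observed | x).

    Assumption (not taken from the paper): relevant labels are observed
    completely at random, so P(observed | x) = c * P(relevant | x).
    """
    if not 0.0 < label_frequency <= 1.0:
        raise ValueError("label_frequency must lie in (0, 1]")
    # Divide by c, then clip because the ratio can exceed 1.
    return min(1.0, max(0.0, p_observed / label_frequency))


def chain_predict(
    x: Sequence[float],
    models: Sequence[Callable[[List[float]], float]],
    label_frequencies: Sequence[float],
) -> List[float]:
    """Run a chain of classifiers, scaling each output before passing it on.

    Each model is any callable that maps a feature list to an estimate of
    P(label observed | features). The scaled estimate is appended to the
    features seen by the next model (a hypothetical feature modification).
    """
    features = list(x)
    outputs = []
    for model, c in zip(models, label_frequencies):
        p_relevant = scale_pu_probability(model(features), c)
        outputs.append(p_relevant)
        features.append(p_relevant)
    return outputs


# Toy usage: two stand-in "classifiers" written as plain functions.
toy_models = [lambda f: 0.2, lambda f: 0.5 * f[-1]]
print(chain_predict([1.0], toy_models, [0.5, 0.8]))
```

In practice the label frequencies are unknown and would have to be estimated, and the stand-in callables would be replaced by trained probabilistic classifiers.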

Keywords: Classifier chains, Multi-label classification, Positive-unlabelled learning

Review history: Received 26 June 2020, Revised 19 November 2020, Accepted 17 December 2020, Available online 28 December 2020, Version of Record 28 December 2020.

Official paper URL: https://doi.org/10.1016/j.knosys.2020.106709