Graph-based boosting algorithm to learn labeled and unlabeled data
Authors:
Highlights:
• Some ensemble learning algorithms have been proposed to exploit the information in unlabeled data. Because labeled data are scarce, these methods have to train on samples with pseudo-labels, which inevitably introduce incorrect information during training. In this paper, we propose a novel graph-based boosting algorithm (GBB) that learns from labeled and unlabeled data without using pseudo-labels during training.
• GBB is a framework that combines many models linearly, and the similarity matrix of the samples is transformed during training.
• We also extend GBB to a variant, termed weighted GBB (WGBB), that learns from imbalanced data.
• Experimental results show that, compared with other related algorithms, GBB achieves competitive performance and WGBB has a clear advantage in classifying imbalanced data.
Keywords: Graph, Boosting, Semi-supervised learning, Imbalance learning
Article history: Received 24 November 2019, Revised 30 April 2020, Accepted 2 May 2020, Available online 5 May 2020, Version of Record 20 May 2020.
Official paper URL: https://doi.org/10.1016/j.patcog.2020.107417