Adaptive sampling using self-paced learning for imbalanced cancer data pre-diagnosis

Authors:

Highlights:

• This paper develops ISPL, an imbalanced sampling method based on self-paced learning.

• ISPL selects samples in order from high confidence to low confidence to build a balanced dataset.

• ISPL outperforms competing methods such as ENN, SMOTE, and One-Sided Selection.

• The model selects highly relevant genes for the early prognosis of cancer.
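
The high-confidence-to-low-confidence selection described in the highlights can be illustrated with a minimal sketch. This is not the authors' ISPL implementation: the function name `self_paced_undersample`, the `pace_steps` parameter, and the linear threshold schedule are all assumptions made for illustration. The per-sample losses are taken as fixed inputs here, whereas a self-paced learning loop would normally refit the classifier and recompute the losses after each step.

```python
def self_paced_undersample(losses, labels, majority_label, pace_steps=5):
    """Pick majority-class samples easy-first until the classes are balanced.

    losses: per-sample loss from some classifier (lower loss = higher confidence).
    Returns the sorted indices of the balanced subset.
    Hypothetical helper for illustration; not the published ISPL algorithm.
    """
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    if not majority:
        return sorted(minority)
    target = min(len(minority), len(majority))

    lo = min(losses[i] for i in majority)
    hi = max(losses[i] for i in majority)
    selected = []
    for step in range(1, pace_steps + 1):
        # The pace threshold grows linearly, so harder samples are admitted later.
        threshold = hi if step == pace_steps else lo + (hi - lo) * step / pace_steps
        candidates = sorted(
            (i for i in majority if i not in selected and losses[i] <= threshold),
            key=lambda i: losses[i],
        )
        for i in candidates:
            if len(selected) == target:
                break
            selected.append(i)
        if len(selected) == target:
            break
    return sorted(minority + selected)


# Toy data (made up): four majority samples (label 0), two minority samples (label 1).
labels = [0, 0, 0, 0, 1, 1]
losses = [0.9, 0.1, 0.5, 0.3, 0.2, 0.4]
balanced = self_paced_undersample(losses, labels, majority_label=0)
print(balanced)  # [1, 3, 4, 5]
```

In this toy run the two lowest-loss majority samples (indices 1 and 3) are kept alongside both minority samples, and the two hardest majority samples are left out.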


Keywords: Imbalanced classification, Adaptive sampling, Cancer pre-diagnosis, Elastic-net regularization

Review history: Received 20 April 2019, Revised 20 February 2020, Accepted 20 February 2020, Available online 22 February 2020, Version of Record 21 March 2020.

Official URL: https://doi.org/10.1016/j.eswa.2020.113334