Diverse training dataset generation based on a multi-objective optimization for semi-supervised classification

Authors:

Highlights:

• A new method that mitigates the shortage of labeled data and increases the accuracy of semi-supervised classification.

• A synthetic labeled data generation approach that targets low density (high diversity) and high classification accuracy.

• Optimization of synthetic labeled instances with the non-dominated sorting genetic algorithm II (NSGA-II).

• Extensive experiments on 63 challenging datasets demonstrate the effectiveness of our approach.

Keywords: Self-labeled, Semi-supervised learning, Evolutionary multi-objective optimization, Data density function, NSGA-II

Review history: Received 27 April 2019, Revised 4 June 2020, Accepted 11 July 2020, Available online 12 July 2020, Version of Record 16 July 2020.

Official paper URL: https://doi.org/10.1016/j.patcog.2020.107543