A multi-objective evolutionary approach to training set selection for support vector machine

Abstract

The Support Vector Machine (SVM) is one of the most powerful algorithms for machine learning and data mining in numerous and heterogeneous application domains. However, in spite of its competitiveness, SVM suffers from scalability problems which drastically worsen its performance in terms of memory requirements and execution time. As a consequence, a growing number of approaches aim to help SVM address these problems efficiently without affecting its classification capabilities. In this scenario, methods for Training Set Selection (TSS) represent a suitable and consolidated pre-processing technique to compute a reduced but representative training dataset, and improve SVM’s scalability without degrading its classification accuracy. Recently, TSS has been formulated as an optimization problem characterized by two objectives (the classification accuracy and the reduction rate) and solved through the application of evolutionary algorithms. However, so far, all the evolutionary approaches for TSS have been based on a so-called multi-objective a priori technique, where multiple objectives are aggregated into a single objective through a weighted combination. This paper proposes to apply, for the first time, a Pareto-based multi-objective optimization approach to the TSS problem in order to deal explicitly with both of its objectives and offer a better trade-off between SVM’s classification and reduction performance. The benefits of the proposed approach are validated by a set of experiments involving well-known datasets taken from the UCI Machine Learning Database Repository. As shown by statistical tests, the application of a Pareto-based multi-objective optimization approach improves on state-of-the-art TSS techniques and enhances SVM efficiency.
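The difference between the a priori weighted combination and a Pareto-based treatment of the two TSS objectives can be illustrated with a small sketch. It is not the paper's algorithm: the candidate scores are invented for illustration, and the names `weighted_fitness`, `dominates`, `pareto_front` and the weight `alpha` are hypothetical. Each candidate stands for a reduced training set scored by (classification accuracy, reduction rate), both to be maximized.

```python
from typing import List, Tuple

# (classification accuracy, reduction rate), both maximized.
# All values below are made up for illustration only.
Score = Tuple[float, float]


def weighted_fitness(score: Score, alpha: float = 0.5) -> float:
    """A priori aggregation: collapse both objectives into one scalar.

    alpha is a hypothetical weight chosen before the search starts.
    """
    accuracy, reduction = score
    return alpha * accuracy + (1.0 - alpha) * reduction


def dominates(a: Score, b: Score) -> bool:
    """a dominates b if it is no worse on every objective and better on one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Score]) -> List[Score]:
    """Keep every score that no other score dominates."""
    return [s for s in scores if not any(dominates(t, s) for t in scores)]


candidates = [(0.95, 0.10), (0.90, 0.60), (0.80, 0.90), (0.85, 0.50), (0.70, 0.85)]

# The weighted combination returns a single compromise fixed by alpha.
best_weighted = max(candidates, key=weighted_fitness)

# The Pareto-based view returns the whole set of non-dominated trade-offs.
front = pareto_front(candidates)

print(best_weighted)  # (0.8, 0.9)
print(front)          # [(0.95, 0.1), (0.9, 0.6), (0.8, 0.9)]
```

With a fixed weight the search commits to one compromise in advance, whereas the non-dominated set keeps the high-accuracy and high-reduction alternatives visible so the trade-off can be chosen afterwards.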

Keywords: Training set selection, Multi-objective optimization, Support vector machine

Review history: Received 12 September 2017, Revised 8 February 2018, Accepted 12 February 2018, Available online 13 February 2018, Version of Record 28 February 2018.

Official paper URL: https://doi.org/10.1016/j.knosys.2018.02.022