Fast instance selection for speeding up support vector machines

Authors:

Abstract

Support vector machines (SVMs) perform well on binary classification, but applying them effectively to massive datasets with many classes and instances remains a serious challenge. Instance selection methods have proven effective at reducing the training complexity of SVMs, but they trade away some generalization performance. This paper presents an instance selection method designed specifically for multi-class problems. Instances are selected for each one-versus-rest SVM model, using the cluster centers of the positive class as reference points. Clustering is used here to make instance selection more efficient, rather than to select instances directly from the clusters as previous methods did. Experiments on a wide variety of datasets show that the proposed method selects fewer instances than most competing algorithms while achieving the highest classification accuracy on most datasets. The experimental results also show that the method performs strongly on binary problems.
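The abstract does not spell out the selection rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a plain k-means step on the positive class and a hypothetical rule that keeps all positive instances plus the fraction of negative instances closest to any positive cluster center. The names `kmeans`, `select_for_class`, `k`, and `keep_ratio` are illustrative assumptions and do not come from the paper.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns k centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def select_for_class(X, y, positive, k=2, keep_ratio=0.5):
    """Return sorted indices kept for the `positive`-vs-rest problem.

    Assumed rule (not from the paper): keep every positive instance and
    the `keep_ratio` fraction of negatives nearest to a positive center.
    """
    pos = [i for i, label in enumerate(y) if label == positive]
    neg = [i for i, label in enumerate(y) if label != positive]
    if not pos:
        return []

    # Cluster centers of the positive class act as reference points.
    centers = kmeans([X[i] for i in pos], min(k, len(pos)))

    def dist_to_reference(i):
        return min(math.dist(X[i], c) for c in centers)

    n_keep = max(1, int(len(neg) * keep_ratio))
    kept_neg = sorted(neg, key=dist_to_reference)[:n_keep]
    return sorted(pos + kept_neg)


# Toy data: class 0 near the origin, class 1 nearby, class 2 far away.
X = [[0, 0], [0, 1], [1, 0], [2, 2], [2, 3], [10, 10], [11, 10]]
y = [0, 0, 0, 1, 1, 2, 2]
selected = select_for_class(X, y, positive=0, k=1, keep_ratio=0.5)
print(selected)  # [0, 1, 2, 3, 4]
```

In this toy run the two distant class-2 points are dropped from the class-0-versus-rest training set, since under the assumed rule they are unlikely to lie near the decision boundary. In a multi-class setting the same selection would be repeated once per class, and each one-versus-rest SVM would be trained only on its own selected indices.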

Keywords: SVM, Classification, Multi-class, Instance selection, Clustering

Article history: Received 18 July 2012, Revised 20 January 2013, Accepted 25 January 2013, Available online 16 February 2013.

Official URL: https://doi.org/10.1016/j.knosys.2013.01.031