A reduced data set method for support vector regression

Authors:

Abstract

Support vector regression (SVR) has been very successful in pattern recognition, text categorization, and function approximation. The theory of SVR is based on the idea of structural risk minimization. In real application systems, the data often suffer from noise and outliers. When noise and/or outliers exist in the sampled data, the SVR may try to fit those improper data, and the resulting system may overfit. In addition, the memory needed to store the kernel matrix of SVR grows as O(N^2), where N is the number of training data. Hence, for a large training data set, the kernel matrix may not fit in memory. In this paper, a reduced support vector regression is proposed for nonlinear function approximation problems with noise and outliers. The core idea of this approach is to adopt fuzzy clustering and a robust fuzzy c-means (RFCM) algorithm to reduce the computational time of SVR and to greatly mitigate the influence of data noise and outliers.
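To make the two claims in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes 8-byte floating-point kernel entries, and it uses the standard fuzzy c-means update as a stand-in for the robust RFCM variant, whose details the abstract does not give. The function names `kernel_matrix_bytes` and `fuzzy_c_means` are hypothetical, as is the idea of training the SVR on the returned cluster centers instead of on all N samples.

```python
import random

def kernel_matrix_bytes(n, bytes_per_entry=8):
    """Memory for a dense n x n kernel matrix (assumes 8-byte floats)."""
    return n * n * bytes_per_entry

def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means on tuples of floats (NOT the robust RFCM variant).

    Returns c cluster centers, which could serve as a reduced training set.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, c)  # initialise centers from the data
    dim = len(points[0])
    for _ in range(iters):
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        memberships = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, ctr)) for ctr in centers]
            if min(d2) == 0.0:
                # The point sits exactly on a center: give it full membership there.
                u = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                u = [d ** (-1.0 / (m - 1.0)) for d in d2]  # d2 is already squared
            total = sum(u)
            memberships.append([v / total for v in u])
        # Center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for k in range(c):
            w = [u[k] ** m for u in memberships]
            total = sum(w)
            new_centers.append(tuple(
                sum(wi * p[j] for wi, p in zip(w, points)) / total
                for j in range(dim)
            ))
        centers = new_centers
    return centers

# Kernel matrix size before and after reducing N = 100000 samples to 500 centers.
print(kernel_matrix_bytes(100_000))  # 80000000000 bytes (80 GB)
print(kernel_matrix_bytes(500))      # 2000000 bytes (2 MB)
```

Under these assumptions, reducing the training set from 100,000 samples to 500 cluster centers shrinks the dense kernel matrix by a factor of 40,000. Plain fuzzy c-means gives every sample some membership in every cluster, so outliers still pull on the centers; limiting that influence is presumably what the robust variant addresses.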

Keywords: Support vector regression, Outlier, Fuzzy clustering, Robust fuzzy c-means

Review history: Available online 7 May 2010.

Official paper URL: https://doi.org/10.1016/j.eswa.2010.04.062