Feature selection using localized generalization error for supervised classification problems using RBFNN

Abstract

A pattern classification problem usually involves high-dimensional features, which make the classifier complex and difficult to train. Without feature reduction, both training accuracy and generalization capability suffer. This paper proposes a novel hybrid filter–wrapper-type feature subset selection methodology based on a localized generalization error model. For a radial basis function neural network (RBFNN), the localized generalization error model gives an upper bound on the generalization error for unseen samples located within a neighborhood of the training samples. At each iteration, the feature making the smallest contribution to the generalization error bound is removed. Moreover, the proposed feature selection method is independent of the sample size and is computationally fast. The experimental results show that the proposed method consistently removes large percentages of features with statistically insignificant loss of testing accuracy on unseen samples. For two of the datasets, classifiers built on feature subsets with 90% of the features removed by the proposed approach yield higher average testing accuracies than those trained on the full feature set. Finally, we corroborate the efficacy of the model by using it to predict corporate bankruptcies in the US.
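
The iterative removal step described in the abstract can be illustrated with a generic greedy backward-elimination loop. This is only a minimal sketch, not the paper's implementation: the function names `backward_eliminate` and `toy_bound`, the `n_keep` stopping rule, and the toy importance values are all hypothetical, and a feature's "contribution" is assumed here to mean the value the bound takes once that feature is dropped. In practice `bound` would be replaced by the localized generalization error bound of an RBFNN trained on the candidate subset.

```python
from typing import Callable, Dict, List, Sequence


def backward_eliminate(
    features: Sequence[int],
    bound: Callable[[List[int]], float],
    n_keep: int,
) -> List[int]:
    """Greedy backward elimination driven by a user-supplied error bound."""
    selected = list(features)
    while len(selected) > n_keep:
        # Evaluate the bound with each remaining feature removed in turn.
        scores: Dict[int, float] = {
            f: bound([g for g in selected if g != f]) for f in selected
        }
        # Assumption: the feature whose removal leaves the lowest bound
        # is the one contributing least, so it is dropped.
        drop = min(selected, key=lambda f: scores[f])
        selected.remove(drop)
    return selected


# Hypothetical stand-in for an error bound: the bound grows by a fixed
# amount for every feature that is missing from the subset.
IMPORTANCE = {0: 5.0, 1: 0.1, 2: 3.0, 3: 0.2}


def toy_bound(subset: List[int]) -> float:
    return sum(w for f, w in IMPORTANCE.items() if f not in subset)


kept = backward_eliminate([0, 1, 2, 3], toy_bound, n_keep=2)
print(kept)
```

With the toy values above, the two low-weight features (1 and 3) are removed first, leaving features 0 and 2. A fixed `n_keep` is used only to keep the sketch short; other stopping criteria are equally possible.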

Keywords: Feature selection, Neural network, Generalization error, RBFNN

Review history: Received 1 March 2007, Revised 1 March 2008, Accepted 5 May 2008, Available online 15 May 2008.

Paper URL: https://doi.org/10.1016/j.patcog.2008.05.004