A data mining-based subset selection for enhanced discrimination using iterative elimination of redundancy

Abstract

Redundant or irrelevant features can mask the underlying patterns in data mining tasks, so the number of features is often reduced with a feature selection technique. The objective of feature selection is to find the feature subset that gives the best performance. This work proposes a new feature selection method that combines orthogonal filtering with a nonlinear representation of the data to improve discrimination performance. Orthogonal filtering is applied to remove unwanted variation from the data. The method then uses kernel principal component analysis, a nonlinear kernel method, to extract nonlinear characteristics of the data and to reduce its dimensionality. Features are selected by iterative backward elimination, using the selection criterion of linear discriminant analysis. The proposed method is compared with three other methods and outperforms all of them. The results indicate that combining filtering with a kernel method is a promising approach to efficient feature selection.
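The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an RBF kernel for kernel principal component analysis, uses the Fisher ratio trace(Sw^-1 Sb) as a stand-in for the linear discriminant analysis selection criterion, and omits the orthogonal filtering step because the abstract does not specify it. The function names, the `gamma` and `n_components` parameters, and the synthetic data are all hypothetical.

```python
import numpy as np

def rbf_kernel_pca(X, n_components=2, gamma=0.1):
    """Return kernel PCA scores of the rows of X (RBF kernel is an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    # Center the kernel matrix in feature space.
    Kc = K - one @ K - K @ one + one @ K @ one
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    vals = np.maximum(vals[idx], 1e-12)
    # Scores of the training points on the leading components.
    return vecs[:, idx] * np.sqrt(vals)

def fisher_score(Z, y):
    """Fisher ratio trace(Sw^-1 Sb); a stand-in for the paper's LDA criterion."""
    d = Z.shape[1]
    overall_mean = Z.mean(axis=0)
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Zc = Z[y == c]
        mc = Zc.mean(axis=0)
        Sw += (Zc - mc).T @ (Zc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Zc) * (diff @ diff.T)
    Sw += 1e-8 * np.eye(d)  # small ridge keeps Sw invertible
    return float(np.trace(np.linalg.solve(Sw, Sb)))

def backward_eliminate(X, y, n_keep, n_components=2, gamma=0.1):
    """Iteratively drop the feature whose removal leaves the highest criterion."""
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        scores = []
        for j in keep:
            trial = [k for k in keep if k != j]
            Z = rbf_kernel_pca(X[:, trial], n_components, gamma)
            scores.append(fisher_score(Z, y))
        keep.pop(int(np.argmax(scores)))
    return keep

# Hypothetical data: feature 0 separates the two classes, features 1-3 are noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 4))
X[:, 0] += 6.0 * y
selected = backward_eliminate(X, y, n_keep=2)
print(selected)
```

Each elimination round refits kernel PCA once per remaining feature, so the cost grows roughly quadratically with the number of features. A practical version would likely evaluate the criterion on held-out data rather than on the training scores used here.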

Keywords: Variable selection, Support vector machine, Backward elimination, Kernel principal component analysis, Discriminant analysis

Review history: Available online 5 December 2007.

Official paper URL: https://doi.org/10.1016/j.eswa.2007.11.020