PLS-based recursive feature elimination for high-dimensional small sample

Authors:

Highlights:

Abstract

This paper addresses feature selection for high-dimensional small samples (HDSS). We first present a general analytical framework for feature selection on HDSS data, covering the selection strategy (single-feature ranking and multi-feature ranking) and the evaluation criteria (feature subset consistency and compactness). We then propose partial least squares (PLS) based feature selection methods for HDSS, together with two theorems. The proposed methodology comprises a PLS model for classification, parameter selection, PLSRanking, and PLS-based recursive feature elimination. We compare the proposed methods with several existing feature selection methods: Support Vector Machine (SVM) based feature selection, SVM-based recursive feature elimination (SVMRFE), Random Forest (RF) based feature selection, RF-based recursive feature elimination (RFRFE), the ReliefF algorithm, and ReliefF-based recursive feature elimination (ReliefFRFE). Using twelve high-dimensional datasets from different research areas, we evaluate the results in terms of accuracy (sensitivity and specificity), running time, and feature subset consistency and compactness. The analysis shows that the proposed approach performs well on both two-category and multi-category problems.
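The general idea of PLS-based recursive feature elimination can be illustrated with a minimal sketch. The abstract does not give the ranking criterion, the PLS variant, or the parameter selection procedure, so everything below is an assumption made for illustration: a two-category problem with labels coded as -1/+1, a PLS1 model fitted with the NIPALS algorithm, features ranked by the absolute value of their PLS regression coefficient, and the hypothetical function names `pls1_coefficients` and `pls_rfe`. It is not the paper's PLSRanking or its exact elimination scheme.

```python
import numpy as np

def pls1_coefficients(X, y, n_components=2):
    """Fit single-response PLS (NIPALS) and return one coefficient per feature."""
    X = X - X.mean(axis=0)          # center features
    y = y - y.mean()                # center response
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))   # weight vectors
    P = np.zeros((n_features, n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings
    used = 0
    for a in range(n_components):
        w = X.T @ y
        norm = np.linalg.norm(w)
        if norm < 1e-12:            # nothing left to explain; stop early
            break
        w /= norm
        t = X @ w                   # score vector
        tt = t @ t
        p = X.T @ t / tt
        q[a] = (y @ t) / tt
        X = X - np.outer(t, p)      # deflate X
        y = y - q[a] * t            # deflate y
        W[:, a], P[:, a] = w, p
        used = a + 1
    if used == 0:
        return np.zeros(n_features)
    W, P, q = W[:, :used], P[:, :used], q[:used]
    return W @ np.linalg.solve(P.T @ W, q)

def pls_rfe(X, y, n_keep, n_components=2, step=1):
    """Repeatedly refit PLS and drop the features with the smallest |coefficient|.

    Returns (indices of kept features, eliminated indices in removal order).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0             # avoid division by zero for constant features
    Xs = (X - X.mean(axis=0)) / std # standardize so coefficients are comparable
    remaining = np.arange(X.shape[1])
    eliminated = []
    while remaining.size > n_keep:
        k = min(n_components, remaining.size)
        coef = pls1_coefficients(Xs[:, remaining], y, k)
        n_drop = min(step, remaining.size - n_keep)
        worst = np.argsort(np.abs(coef))[:n_drop]   # least important first
        eliminated.extend(remaining[worst].tolist())
        remaining = np.delete(remaining, worst)
    return remaining, eliminated
```

Calling `pls_rfe(X, y, n_keep=10)` on a samples-by-features matrix `X` returns the indices of the ten surviving features and the order in which the others were eliminated. Refitting after each elimination is what distinguishes this from a single-pass ranking; a larger `step` trades ranking granularity for running time, which matters when the number of features far exceeds the number of samples.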

Keywords: High-dimensional small samples (HDSS), Partial least squares (PLS), Recursive feature elimination (RFE), Feature subset consistency, Feature subset compactness

Article history: Received 13 January 2013, Revised 29 September 2013, Accepted 1 October 2013, Available online 14 October 2013.

Paper URL: https://doi.org/10.1016/j.knosys.2013.10.004