A sparse logistic regression framework by difference of convex functions programming

作者:Liming Yang, Yannan Qian

摘要

Feature selection for logistic regression (LR) is still a challenging subject. In this paper, we present a new feature selection method for logistic regression based on a combination of the zero-norm and l 2-norm regularization. However, discontinuity of the zero-norm makes it difficult to find the optimal solution. We apply a proper nonconvex approximation of the zero-norm to derive a robust difference of convex functions (DC) program. Moreover, DC optimization algorithm (DCA) is used to solve the problem effectively and the corresponding DCA converges linearly. Compared with traditional methods, numerical experiments on benchmark datasets show that the proposed method reduces the number of input features while maintaining accuracy. Furthermore, as a practical application, the proposed method is used to directly classify licorice seeds using near-infrared spectroscopy data. The simulation results in different spectral regions illustrates that the proposed method achieves equivalent classification performance to traditional logistic regressions yet suppresses more features. These results show the feasibility and effectiveness of the proposed method.

论文关键词:Logistic regression, Feature selection, Zero-norm, DC programming

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10489-016-0758-2