Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines

Authors: Syed Muhammad Saqlain, Muhammad Sher, Faiz Ali Shah, Imran Khan, Muhammad Usman Ashraf, Muhammad Awais, Anwar Ghani

Abstract

The heart is one of the essential organs of the human body, and its failure is a major contributor to human mortality. Coronary heart disease may be asymptomatic but can be anticipated through medical tests and the subject's daily routine. Diagnosing coronary heart disease requires specialized medical experts with extensive experience. Worldwide, and particularly in developing countries, such experts are scarce, which makes diagnosis more difficult. In this paper, we present a clinical heart disease diagnostic system built on a feature subset selection methodology, with the objective of improving performance. The proposed methodology presents three algorithms for selecting candidate feature subsets: (1) a mean Fisher score-based feature selection algorithm, (2) a forward feature selection algorithm and (3) a reverse feature selection algorithm. A feature subset selection algorithm then selects the most decisive subset from the candidate feature subsets. Features are added to the feature subsets on the basis of their individual Fisher scores, while the selection of a feature subset depends on its Matthews correlation coefficient score and its dimension. The selected feature subset, with reduced dimension, is fed to an RBF kernel-based SVM that performs binary classification: (1) heart disease patient and (2) normal control subject. The proposed methodology is validated through accuracy, specificity and sensitivity using four UCI datasets, i.e., Cleveland, Switzerland, Hungarian and SPECTF. The statistical results achieved with the proposed technique are compared with existing techniques and show better performance. It achieves accuracies of 81.19%, 84.52%, 92.68% and 82.7% on Cleveland, Hungarian, Switzerland and SPECTF, respectively.
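The two quantities the abstract relies on can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas or the details of the forward and reverse algorithms, so the code below assumes the common textbook Fisher score (between-class scatter divided by within-class scatter, per feature) and the standard Matthews correlation coefficient computed from confusion-matrix counts. The function names `fisher_scores`, `rank_features` and `mcc` are hypothetical and chosen for illustration only.

```python
import math
from collections import defaultdict


def fisher_scores(X, y):
    """Per-feature Fisher score (assumed textbook form):
    sum_k n_k * (mean_k - mean)^2 / sum_k n_k * var_k."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # between-class scatter for feature j
        within = 0.0   # within-class scatter for feature j
        for rows in by_class.values():
            values = [r[j] for r in rows]
            mean_k = sum(values) / len(values)
            var_k = sum((v - mean_k) ** 2 for v in values) / len(values)
            between += len(values) * (mean_k - overall_mean) ** 2
            within += len(values) * var_k
        # A feature that is constant within every class gets score 0 here
        # (an illustrative convention, not taken from the paper).
        scores.append(between / within if within > 0 else 0.0)
    return scores


def rank_features(X, y):
    """Feature indices ordered from highest to lowest Fisher score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Under these assumptions, candidate subsets could be formed by taking features in the order returned by `rank_features`, training the classifier on each subset, and comparing the subsets by `mcc` and by their size; how the paper actually trades off the two is not stated in the abstract.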

Keywords: Heart disease, Feature selection, Fisher score, SVM, RBF

Paper URL: https://doi.org/10.1007/s10115-018-1185-y