Automatic selection of heavy-tailed distributions-based synergy Henry gas solubility and Harris hawk optimizer for feature selection: case study drug design and discovery

Authors: Mohamed Abd Elaziz, Dalia Yousri

Abstract

Feature selection (FS) approaches have attracted growing attention because they have been applied in many fields, primarily to deal with high-dimensional data. Increasing the dimensionality of the data can degrade the accuracy of a machine learning method. Several FS methods based on meta-heuristic (MH) techniques have therefore been developed to tackle the FS problem and avoid the limitations of traditional FS approaches. However, these MH methods still suffer from drawbacks that affect the quality of the final output, and they need improvement. This paper proposes a modified Henry Gas Solubility Optimization (HGSO) that uses an enhanced Harris hawks optimization (HHO) based on heavy-tailed distributions (HTDs). In this study, a dynamic exchange between five HTDs is used to boost the HHO, which in turn modifies the exploitation phase of HGSO. The result is a dynamic modified HGSO based on enhanced HHO (DHGHHD). To assess the efficiency of the proposed DHGHHD, a set of eighteen UCI datasets is used. Furthermore, it is applied to improve prediction on two real-world datasets in the drug design and discovery field. The DHGHHD is compared with eight well-known MH methods. The comparison results illustrate the high quality of DHGHHD in terms of accuracy, fitness value, and the number of selected features.
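The abstract evaluates methods by accuracy, fitness value, and number of selected features, but does not state how the fitness is computed. As a rough illustration only, the sketch below uses a weighted sum of classification error and the fraction of features kept, a form often seen in wrapper-based FS work. The function name `fs_fitness`, the weight `alpha`, and its default of 0.9 are assumptions made for this example, not details taken from the paper.

```python
def fs_fitness(accuracy, mask, alpha=0.9):
    """Score a candidate feature subset; lower is better.

    accuracy: classifier accuracy in [0, 1] obtained with the selected features.
    mask:     sequence of 0/1 flags, one per feature (1 = feature selected).
    alpha:    assumed trade-off weight between error and subset size.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    n_total = len(mask)
    n_selected = sum(1 for bit in mask if bit)
    # An empty subset cannot train a classifier, so give it the worst score.
    if n_total == 0 or n_selected == 0:
        return 1.0
    error_term = alpha * (1.0 - accuracy)
    size_term = (1.0 - alpha) * n_selected / n_total
    return error_term + size_term


if __name__ == "__main__":
    # Two of four features kept, 90% accuracy.
    print(fs_fitness(0.9, [1, 0, 1, 0]))
```

Under this assumed form, a meta-heuristic such as the one described would search over binary masks for the lowest score, so a subset that keeps accuracy while dropping features is preferred over a larger subset with the same accuracy.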

Keywords: Feature selection (FS), Heavy-tailed distributions, Henry gas solubility, Harris hawks distribution, Drug design and discovery

Paper URL: https://doi.org/10.1007/s10462-021-10009-z