Evolutionary based ensemble framework for realizing transfer learning in HIV-1 Protease cleavage sites prediction

Authors: Deepak Singh, Pradeep Singh, Dilip Singh Sisodia

Abstract

Human immunodeficiency virus (HIV) protease is indispensable to viral maturation, and drug therapy for HIV infection primarily targets it. Protease inhibitors are designed to block the active site of the protease, thereby restraining replication of the viral particle. However, designing efficient inhibitors is challenging because the cleaved sites share little or no sequence similarity and only a few experimentally verified sites are available. Support vector machines have been used extensively to learn the sequence structure; however, insufficient training data degrades their performance. The proposed ensemble model therefore adopts a cross-domain approach to predicting HIV-1 protease cleavage sites. This study proposes a method for optimally combining multiple weighted classifiers by incorporating the knowledge derived from various amino acid encoding techniques. Each classifier is paired with a specific type of heterogeneous information generated by a different encoding method, and the final prediction is obtained by aggregating the locally trained classifiers. The optimal pairing of features and classifiers that characterizes the heterogeneous features is found efficiently by a genetic algorithm. The efficiency of the model is verified by tests on the four HIV-1 protease datasets available from the UCI Machine Learning Repository. Average accuracy, standard deviation, and area under the curve are evaluated for the proposed model to demonstrate its improvement over other state-of-the-art methods. In addition, Friedman and post hoc tests were conducted to show that the improvement achieved by the proposed framework is statistically significant. These results quantify the performance gain of the proposed ensemble model.
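As a rough illustration of the idea of combining weighted classifiers with a genetic algorithm, the sketch below evolves one weight per classifier so that a weighted average of their predicted probabilities maximizes accuracy on held-out data. It is a minimal sketch and not the authors' implementation: the function names, the 0.5 decision threshold, the elitist selection scheme, the crossover and mutation operators, and all parameter values are assumptions made for illustration. The abstract also describes pairing each classifier with an encoding, which this sketch does not attempt; it assumes each classifier has already been trained on its own encoding and only searches for the combination weights.

```python
import random


def ensemble_predict(weights, probs):
    """Weighted average of per-classifier probabilities, thresholded at 0.5.

    probs[k][i] is the (hypothetical) probability that classifier k assigns
    to sample i being a cleavage site.
    """
    total = sum(weights) or 1.0  # avoid division by zero if all weights are 0
    n_samples = len(probs[0])
    preds = []
    for i in range(n_samples):
        score = sum(w * p[i] for w, p in zip(weights, probs)) / total
        preds.append(1 if score >= 0.5 else 0)
    return preds


def accuracy(weights, probs, labels):
    """Fraction of samples the weighted ensemble classifies correctly."""
    preds = ensemble_predict(weights, probs)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def evolve_weights(probs, labels, pop_size=20, generations=30, seed=0):
    """Search for classifier weights in [0, 1] with a simple elitist GA."""
    rng = random.Random(seed)
    n = len(probs)
    # Start from the uniform-weight ensemble plus random candidates.
    pop = [[1.0] * n]
    pop += [[rng.random() for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism).
        pop.sort(key=lambda w: accuracy(w, probs, labels), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            j = rng.randrange(n)       # Gaussian mutation of one gene
            child[j] = min(1.0, max(0.0, child[j] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: accuracy(w, probs, labels))
```

Calling `evolve_weights(probs, labels)` with one probability list per classifier returns a weight vector that can be passed to `ensemble_predict`. Because the uniform-weight ensemble is in the initial population and the fitter half is always retained, the returned weights are never worse than uniform weighting on the data used for fitness.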

Keywords: Cross-domain adaptation, Ensemble learner, Genetic algorithm, HIV-1 proteases, Transfer learning

Paper URL: https://doi.org/10.1007/s10489-018-1323-y