Application of active learning in DNA microarray data for cancerous gene identification

Authors:

Highlights:

Abstract

Microarray technology plays an important role in evaluating gene-expression data and bringing unique expression patterns to light. In gene-expression-based experiments, the expression levels of genes are monitored in order to classify a tissue sample. In microarray data, the expression of genes is altered in response to pathogens. Genes with altered expression values, identified by analyzing the genes of affected tissues/cells alongside those of unaffected tissues/cells, are termed biomarkers. In this paper, we develop an Active Learning (AL) model that uses a Support Vector Machine (SVM) in association with a feature-selection (FS) algorithm called Symmetrical Uncertainty (SU) for the prediction of cancer. The effectiveness of the proposed AL and SU combination is demonstrated, and the biomarkers, or cancerous genes, identified by the proposed method on four gene-expression data sets are reported. In addition, biological significance tests are performed for the cancer biomarkers obtained from the data sets.
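
For readers unfamiliar with Symmetrical Uncertainty, the measure is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below illustrates that standard definition only; it is not the authors' implementation and does not cover the AL or SVM stages. It assumes expression values have already been discretized (for example into "high" and "low"), and the gene names, labels, and the helper `rank_genes` are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); ranges from 0 to 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing is shared
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy   # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


def rank_genes(expression, labels):
    """Hypothetical helper: gene names sorted by decreasing SU with the labels."""
    scores = {g: symmetrical_uncertainty(v, labels) for g, v in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy, made-up data: discretized expression of 3 genes across 6 samples.
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
expression = {
    "gene_a": ["high", "high", "high", "low", "low", "low"],  # tracks the label
    "gene_b": ["high", "low", "high", "low", "high", "low"],  # weakly related
    "gene_c": ["low", "low", "low", "low", "low", "low"],     # constant
}
ranking = rank_genes(expression, labels)
```

In a pipeline of the kind the abstract describes, the top-ranked genes from such a filter would be the features passed to the classifier; how many genes to keep and how to discretize continuous microarray intensities are design choices this sketch leaves open.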

Keywords: Active learning, Biomarker, Cancer prediction, Microarray data, Symmetrical uncertainty, SVM

Review history: Received 23 May 2020, Revised 29 July 2020, Accepted 13 March 2021, Available online 29 March 2021, Version of Record 12 April 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.114914