Gene selection and sample classification on microarray data based on adaptive genetic algorithm/k-nearest neighbor method

Authors:

Highlights:

Abstract

Recently, microarray technology has been widely used to study gene expression in cancer diagnosis. Its main distinguishing feature is that it can measure thousands of genes at the same time. In the past, researchers typically used parametric statistical methods to find the significant genes. However, microarray data often violate some of the assumptions of parametric statistical methods, or the type I error may be inflated. Therefore, our aim is to establish a gene selection method that is free of such assumptions and reduces the dimension of the data set. In our study, an adaptive genetic algorithm/k-nearest neighbor (AGA/KNN) method was used to evolve gene subsets. We find that AGA/KNN can reduce the dimension of the data set while classifying all test samples correctly. In addition, the accuracy of AGA/KNN is higher than that of GA/KNN, and it takes only half the CPU time of GA/KNN. With the proposed method, biologists can efficiently identify the relevant genes from the selected gene subset and classify the test samples correctly.
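
The general idea of evolving gene subsets with a genetic algorithm scored by a k-nearest neighbor classifier can be illustrated with a minimal sketch. The abstract does not give the chromosome encoding, the fitness function, or the adaptation rule, so everything below is an assumption: a chromosome is a list of `d` distinct gene indices, fitness is leave-one-out KNN accuracy, and the mutation rate shrinks for fitter parents (one common way to make a GA "adaptive"). The names `knn_predict`, `loo_accuracy`, and `aga_knn` are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, genes, k=3):
    """Majority vote among the k nearest training samples, using only `genes`."""
    scored = sorted(
        ((sum((row[g] - x[g]) ** 2 for g in genes), label)
         for row, label in zip(train_X, train_y)),
        key=lambda t: t[0],
    )
    votes = [label for _, label in scored[:k]]
    return Counter(votes).most_common(1)[0][0]


def loo_accuracy(X, y, genes, k=3):
    """Leave-one-out accuracy of KNN restricted to the gene subset `genes`."""
    hits = 0
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], genes, k) == y[i]
    return hits / len(X)


def aga_knn(X, y, n_genes_total, d=3, pop_size=20, generations=30,
            k=3, pm_max=0.5, seed=0):
    """Evolve a subset of d genes; returns (sorted gene indices, fitness)."""
    rng = random.Random(seed)
    pop = [rng.sample(range(n_genes_total), d) for _ in range(pop_size)]
    best, best_fit = None, -1.0
    for _ in range(generations):
        fits = [loo_accuracy(X, y, c, k) for c in pop]
        f_max, f_avg = max(fits), sum(fits) / len(fits)
        ranked = sorted(zip(fits, pop), key=lambda t: t[0], reverse=True)
        if ranked[0][0] > best_fit:
            best_fit, best = ranked[0][0], list(ranked[0][1])
        if best_fit == 1.0:
            break  # every sample classified correctly
        parents = ranked[:pop_size // 2]
        children = [list(best)]  # elitism: carry the best subset forward
        while len(children) < pop_size:
            (f1, p1), (f2, p2) = rng.sample(parents, 2)
            # Crossover (assumed): draw d genes from the union of both parents.
            child = rng.sample(sorted(set(p1) | set(p2)), d)
            # Adaptive mutation (assumed rule): fitter parents mutate less.
            f_parent = max(f1, f2)
            if f_parent >= f_avg and f_max > f_avg:
                pm = pm_max * (f_max - f_parent) / (f_max - f_avg)
            else:
                pm = pm_max
            for j in range(d):
                if rng.random() < pm:
                    unused = [g for g in range(n_genes_total) if g not in child]
                    child[j] = rng.choice(unused)
            children.append(child)
        pop = children
    return sorted(best), best_fit
```

Calling `aga_knn(X, y, n_genes_total=len(X[0]))` on a list of expression rows `X` with class labels `y` returns one candidate gene subset and its leave-one-out accuracy. Leave-one-out scoring is used here only because it needs no separate validation split; a real study would hold out test samples, as the abstract describes.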

Keywords: Gene selection, Sample classification, Adaptive genetic algorithm, k-Nearest neighbor, Microarray data

Review history: Available online 15 August 2010.

Official URL: https://doi.org/10.1016/j.eswa.2010.07.053