A new framework for identifying differentially expressed genes

Abstract

Microarrays have been widely used to classify cancer samples and to discover biological types, for example tumor versus normal phenotypes in cancer research. One of the challenging scientific tasks of the post-genomic era is to identify a subset of differentially expressed genes among the thousands of genes in microarray data. Such a subset would help us understand the underlying molecular mechanisms of diseases, diagnose diseases accurately, and identify novel therapeutic targets. In this paper, we propose a new framework for identifying differentially expressed genes. In the proposed framework, genes are ranked according to their residuals. The performance of the framework is assessed by applying it to several public microarray data sets. Experimental results show that the proposed method yields more robust and accurate rankings than standard statistical tests such as the t-test, the Wilcoxon rank sum test, and the KS-test. Another novelty of the method is an algorithm for selecting a small subset of genes that show significant variation in expression (“outlier” genes). The number of genes in this subset can be controlled through an adjustable window of confidence level. In addition, the results of the proposed method can be visualized: by inspecting the residual plot, we can easily find genes that vary significantly between the two groups of samples and read off their degree of differential expression. In a comparison study, we found several “outlier” genes that had been verified in previous biological experiments but were either missed by other methods or ranked lower by the standard statistical tests.
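The abstract does not specify the regression model or how the window of confidence level is defined, so the following is only a minimal sketch of the general idea of residual-based ranking, not the authors' method. It assumes a simple linear regression of each gene's mean expression in one group on its mean expression in the other, and it treats the confidence level as a two-sided normal quantile cutoff on standardized residuals. The function name `rank_genes_by_residual` and its parameters are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def rank_genes_by_residual(group_a, group_b, confidence=0.99):
    """Rank genes by standardized regression residual (illustrative sketch).

    group_a, group_b: arrays of shape (n_genes, n_samples), one per phenotype.
    confidence: assumed stand-in for the "window of confidence level";
        a larger value yields fewer flagged "outlier" genes.
    Returns (ranking, z, outliers), where ranking lists gene indices from
    largest to smallest absolute standardized residual.
    """
    # Summarize each gene by its mean expression within each group.
    x = np.asarray(group_a, dtype=float).mean(axis=1)
    y = np.asarray(group_b, dtype=float).mean(axis=1)

    # Assumption: ordinary least-squares line y = slope * x + intercept.
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)

    # Standardize residuals; ddof=2 accounts for the two fitted parameters.
    z = residuals / residuals.std(ddof=2)

    # Genes far from the fitted line rank first.
    ranking = np.argsort(-np.abs(z))

    # Assumption: two-sided cutoff taken from the standard normal distribution.
    cutoff = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    outliers = ranking[np.abs(z[ranking]) > cutoff]
    return ranking, z, outliers
```

Plotting `z` against gene index (or against the fitted values) would give a residual plot of the kind the abstract describes, with flagged genes lying outside the band set by the cutoff. Raising `confidence` narrows the selected subset, and lowering it widens the subset.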

Keywords: Microarray data, t-Test, Wilcoxon rank sum test, KS-test, Differentially expressed genes, “Outlier” gene, Regression model, Window of confidence level

Article history: Received 22 March 2006, Revised 31 January 2007, Accepted 31 January 2007, Available online 24 February 2007.

Official paper URL: https://doi.org/10.1016/j.patcog.2007.01.032