A novel approach to feature extraction from classification models based on information gene pairs
Authors:
Abstract
Microarray experiments are now carried out in many laboratories, and microarray data are accumulating rapidly in public repositories. A major challenge in analyzing these data is how to extract and select effective features for accurate cancer classification. Here we introduce a new feature extraction and selection method based on information gene pairs that change significantly across different tissue samples. Experimental results on five public microarray data sets demonstrate that the feature subset selected by the proposed method performs well and achieves higher classification accuracy with several classifiers. We extensively compare the features selected by the proposed method with those selected by other methods, using different evaluation methods and classifiers. The results confirm that the proposed method performs as well as other methods on the acute lymphoblastic-acute myeloid leukemia, adenocarcinoma and breast cancer data sets while using fewer information genes, and that it significantly improves classification accuracy on the colon and diffuse large B-cell lymphoma data sets.
Keywords: Feature extraction, Information gene pair, Microarray data, Cancer classification, Genetic algorithm
Article history: Received 11 November 2006, Revised 19 September 2007, Accepted 20 November 2007, Available online 29 November 2007.
Paper URL: https://doi.org/10.1016/j.patcog.2007.11.019