Gene selection of non-small cell lung cancer data for adjuvant chemotherapy decision using cell separation algorithm

Authors: Najmeh Sadat Jaddi, Mohammad Saniee Abadeh

Abstract

Because chemotherapy is the recommended treatment for non-small cell lung cancer (NSCLC) after surgery, predicting at an early stage whether adjuvant chemotherapy (ACT) will be effective or futile is important for treatment decisions. NSCLC gene expression data are classified to make this prediction. Selecting genes that are highly correlated with the class attribute affects classification accuracy. This paper proposes a new cell separation algorithm (CSA) that imitates cell separation by differential centrifugation, a process involving multiple centrifugation steps with the rotor speed increased at each step. The CSA applies centrifugal force to separate solutions according to their objective function values over successive steps, with the velocity increased at each step. By controlling the selection rate during the search, the CSA provides an automatic trade-off between exploration and exploitation. The CSA was first evaluated on 25 test functions and then applied to predict the effectiveness or futility of ACT. The number of genes in candidate subsets is managed by increasing the subset size when the fitness of the subset has not improved after a certain number of iterations, which reduces time consumption and memory usage. The experiment uses NSCLC data comprising 280 samples collected from four institutes. The result is a minimum subset of five genes with a dependency degree equal to one and a classification accuracy higher than 94% for the SVM, KNN and MLP classifiers.
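The subset-growing rule and the dependency degree mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' CSA: the abstract does not give the centrifugation update rules, so plain random sampling stands in for the search step. The function names, the `patience` and `max_iters` parameters, and the assumption that expression values are already discretized are all hypothetical. The dependency degree is computed in the usual rough-set sense, as the fraction of samples whose values on the selected genes determine the class label unambiguously.

```python
import random
from collections import defaultdict


def dependency_degree(rows, labels, subset):
    """Rough-set dependency degree of the class on the genes in `subset`.

    Assumes discretized expression values. Returns the fraction of samples
    that fall in groups (identical values on `subset`) with a single label.
    """
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[tuple(row[g] for g in subset)].append(label)
    consistent = sum(len(v) for v in groups.values() if len(set(v)) == 1)
    return consistent / len(rows)


def grow_subset_search(rows, labels, patience=20, max_iters=2000, seed=0):
    """Hypothetical stand-in for the search loop (random sampling, not CSA).

    Starts with one-gene subsets and enlarges the subset size only when the
    best fitness has not improved for `patience` consecutive iterations.
    """
    rng = random.Random(seed)
    n_genes = len(rows[0])
    size = 1
    best = tuple(rng.sample(range(n_genes), size))
    best_fit = dependency_degree(rows, labels, best)
    stall = 0
    for _ in range(max_iters):
        if best_fit == 1.0:  # class fully determined: stop early
            break
        cand = tuple(rng.sample(range(n_genes), size))
        fit = dependency_degree(rows, labels, cand)
        if fit > best_fit:
            best, best_fit, stall = cand, fit, 0
        else:
            stall += 1
        if stall >= patience and size < n_genes:
            size += 1  # no improvement for a while: allow larger subsets
            stall = 0
    return best, best_fit
```

Starting from small subsets and growing them only on stagnation keeps most fitness evaluations cheap, which is one plausible reading of the reduced time and memory usage claimed in the abstract.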

Keywords: Gene selection, Classification, Non-small cell lung cancer, Adjuvant chemotherapy decision, Cell separation algorithm

Review process:

Paper URL: https://doi.org/10.1007/s10489-020-01740-1