An effective classification procedure for diagnosis of prostate cancer in near infrared spectra

Abstract

The main purpose of this study is to develop an effective classification procedure that discriminates between normal and cancerous spectra in near infrared (NIR) spectroscopic data in which the classes are highly imbalanced and overlapping. The proposed procedure consists of three steps. First, to make the spectra comparable, each spectrum is normalized by dividing every spectral point by the area under the spectrum (its total intensity). Second, a clustering analysis is performed on the normalized spectra to separate those that show the normal pattern from a mixed group that contains both normal and tumor spectra. Third, a two-stage classification is conducted: the first stage builds a classification model from the labels obtained in the preceding clustering analysis, and the second stage focuses on the spectra that the first model assigns to the mixed group. To increase accuracy, the second classification model is built on selected features that capture important characteristics of the spectral data. The procedure was evaluated by its classification performance on test samples using leave-one-out cross-validation, and it yielded acceptable classification accuracy.
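The steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the classifiers, or the feature selection method, so the sketch assumes a simple nearest-centroid classifier at both stages, takes the cluster labels ("normal" or "mixed") as a precomputed input, approximates the spectrum area with the trapezoidal rule, and omits feature selection. All function names are hypothetical.

```python
import math

def area_normalize(spectrum, step=1.0):
    """Divide each point by the area under the spectrum.

    The trapezoidal rule and the uniform spacing `step` are assumptions.
    """
    area = sum((a + b) / 2.0 * step for a, b in zip(spectrum, spectrum[1:]))
    if area == 0:
        raise ValueError("spectrum has zero area")
    return [v / area for v in spectrum]

def centroid(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_centroids(X, labels):
    """Nearest-centroid model (a stand-in classifier): one mean vector per label."""
    return {lab: centroid([x for x, l in zip(X, labels) if l == lab])
            for lab in set(labels)}

def predict(model, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(model, key=lambda lab: math.dist(model[lab], x))

def two_stage_predict(stage1, stage2, x):
    """Stage 1 separates clear normals from the mixed group; stage 2 resolves the mixed group."""
    if predict(stage1, x) == "normal":
        return "normal"
    return predict(stage2, x)

def leave_one_out_accuracy(X, y, cluster_labels):
    """Hold out each sample once, refit both stages on the rest, and score the prediction."""
    hits = 0
    for i in range(len(X)):
        train = [j for j in range(len(X)) if j != i]
        # Stage 1 learns the cluster labels ("normal" vs "mixed").
        stage1 = fit_centroids([X[j] for j in train],
                               [cluster_labels[j] for j in train])
        # Stage 2 learns the true labels, but only within the mixed group.
        mixed = [j for j in train if cluster_labels[j] == "mixed"]
        stage2 = fit_centroids([X[j] for j in mixed], [y[j] for j in mixed])
        hits += two_stage_predict(stage1, stage2, X[i]) == y[i]
    return hits / len(X)
```

In this sketch the cluster labels are fixed before cross-validation. Whether the clustering should be repeated inside each fold is a design choice that the abstract does not settle.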

Keywords: Classification, Class imbalance, Clustering, Near infrared, Prostate cancer

Publication history: Available online 24 November 2009.

Official paper URL: https://doi.org/10.1016/j.eswa.2009.11.032