Mix-ratio sampling: Classifying multiclass imbalanced mouse brain images using support vector machine

Authors:

Highlights:

Abstract

The Support Vector Machine (SVM) is a classifier designed to optimize classification accuracy, and it has been applied to many image-related problems. Challenges remain, however, when SVM is applied to segmenting mouse brain images: each high-resolution image is a very large data set, and segmentation is a multiclass classification problem in which class sizes are extremely imbalanced. To address these issues, a mix-ratio sampling approach for SVM is proposed that determines a separate over-sampling ratio for each minority class. In addition, spatial information is incorporated into the classification problem to improve image classification accuracy. Five mouse Magnetic Resonance Microscopy (MRM) images were collected to test the accuracy of classifying 21 brain structures. The first comparison experiment demonstrates that SVM with mix-ratio sampling relieves the multiclass imbalance problem more effectively and efficiently than SVM with simple over-sampling. In the second comparison experiment, another classifier, the Artificial Neural Network (ANN), is compared against SVM on the same mix-ratio sampled data, and the results indicate that SVM achieves better classification performance than ANN. Finally, cross-validation demonstrates that SVM with mix-ratio sampling can classify multiclass imbalanced data with high accuracy.
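
The idea of giving each minority class its own over-sampling ratio can be illustrated with a minimal Python sketch. The abstract does not say how the ratios are chosen, so the rule in `ratios_from_counts` (majority size divided by class size, capped at `cap`) is purely an assumption made for illustration, as are the function names and the toy data. The sketch duplicates samples at random with replacement and is not the authors' implementation.

```python
import random
from collections import Counter


def ratios_from_counts(labels, cap=10.0):
    """Hypothetical rule (not from the paper): ratio = majority size / class size,
    capped at `cap` so that very small classes are not duplicated without limit."""
    counts = Counter(labels)
    majority = max(counts.values())
    return {c: min(cap, majority / n) for c, n in counts.items()}


def mix_ratio_oversample(samples, labels, ratios, seed=0):
    """Over-sample each class by its own ratio, drawing duplicates with replacement.
    A ratio of 1.0 leaves a class unchanged; all original samples are kept."""
    rng = random.Random(seed)
    out_x, out_y = list(samples), list(labels)

    # Group sample indices by class label.
    by_class = {}
    for i, c in enumerate(labels):
        by_class.setdefault(c, []).append(i)

    for c, idx in by_class.items():
        # Number of extra copies needed to reach ratio * original class size.
        extra = max(round((ratios.get(c, 1.0) - 1.0) * len(idx)), 0)
        for i in rng.choices(idx, k=extra):
            out_x.append(samples[i])
            out_y.append(c)
    return out_x, out_y


# Toy imbalanced data: one majority class and two minority classes.
labels = [0] * 100 + [1] * 20 + [2] * 5
samples = [[float(i)] for i in range(len(labels))]

ratios = ratios_from_counts(labels, cap=10.0)
res_x, res_y = mix_ratio_oversample(samples, labels, ratios)
class_counts = dict(Counter(res_y))
```

With these toy labels the assumed rule yields different ratios for the two minority classes, which is the point of a mixed-ratio scheme; the resampled data could then be passed to any SVM implementation for training.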

Keywords: Sampling procedure, Imbalanced dataset, Multiclass classification, Support vector machine, Data mining, Brain image segmentation

Publication history: Available online 9 December 2009.

Official paper URL: https://doi.org/10.1016/j.eswa.2009.12.018