Brain tumor segmentation based on the dual-path network of multi-modal MRI images
Authors:
Highlights:
• The main problems in traditional deep learning networks are as follows:
○ Because some boundary features are lost during convolution, the feature maps produced by the up-sampling operation are incomplete. Even when low-level and high-level features are joined by skip connections, the lost information cannot be retrieved. In addition, the algorithm usually ignores the features between patches, so the extracted features lack coherence.
○ Because convolution kernels differ in size, uneven overlap tends to appear during sampling. When the kernel size is not divisible by the stride, details are filled into the low-resolution images. When the multi-modal brain tumor images interact with each other, the overlapping redundancy is magnified further, which wastes time and space. It also tends to cause vanishing or exploding gradients.
○ Because convolution kernels differ in size, the feature-map transformations and the receptive-field information vary in diversity. Moreover, the deep learning network lacks context information and local receptive-field features, so it cannot fuse features well between layers or within a layer. This can lead to poor classification precision for the predicted pixels.
• To overcome these shortcomings and improve the precision of brain tumor segmentation, this paper proposes a new deep learning model, a dual-path network based on multi-modal feature fusion (MFF-DNet). The innovations of the proposed model are as follows:
○ The multi-modal MRI images and the mask images are trained with different convolution kernels. Combining different kernels captures more large-scale features of brain tumors. It reduces the number of training parameters and the loss of information in the sampling process, while enhancing the effectiveness and coherence of information flow.
○ During training, each single-modal input is treated as a feature channel through the residual connection and the dense connection. Training on the multi-modal patch dataset yields a more accurate contrast structure between physiological tissue and soft tissue. To alleviate the influence of the multi-modal channels, the overlapping frequency is reduced. In addition, the fused features are mapped from the high resolution to the low resolution in the up-sampling process, which reduces the overlapping impact by centralizing the weights.
○ A dual-path network based on DenseNet and FPN is established. DenseNet is used to fuse the low-level and middle-level features. The low-level features have a narrow receptive field but rich texture information, whereas the middle-level features carry information from a larger receptive field. This fusion avoids the loss of local structure and fine boundary information, which increases the robustness of the proposed model. Through the FPN, combining the middle-level information with its corresponding high-level features highlights the structure of the lesion. It also increases the diversity of the non-linear structural features of gliomas and improves the segmentation precision.
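The uneven-overlap problem named in the highlights can be illustrated with a small sketch. It assumes the up-sampling step is a one-dimensional transposed convolution, which the highlights do not state explicitly, and the helper names `overlap_counts` and `interior_is_uniform` are hypothetical, introduced only for this illustration. The code counts how many kernel taps land on each output position: when the kernel size is not divisible by the stride, interior positions receive different numbers of contributions.

```python
from typing import List


def overlap_counts(n_inputs: int, kernel_size: int, stride: int) -> List[int]:
    """Count how many kernel taps write to each output position of a
    1-D transposed convolution (no padding). Illustrative helper only."""
    out_len = (n_inputs - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(n_inputs):
        # Input element i spreads over outputs i*stride .. i*stride + kernel_size - 1.
        for k in range(kernel_size):
            counts[i * stride + k] += 1
    return counts


def interior_is_uniform(counts: List[int], kernel_size: int) -> bool:
    """Check whether positions away from the borders all receive the same
    number of contributions (border positions always receive fewer)."""
    interior = counts[kernel_size - 1 : len(counts) - kernel_size + 1]
    return len(set(interior)) <= 1


if __name__ == "__main__":
    uneven = overlap_counts(n_inputs=4, kernel_size=3, stride=2)
    even = overlap_counts(n_inputs=4, kernel_size=4, stride=2)
    print(uneven)  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
    print(even)    # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
    print(interior_is_uniform(uneven, 3), interior_is_uniform(even, 4))  # False True
```

With a kernel of 3 and a stride of 2 the interior counts alternate between 2 and 1, whereas a kernel of 4 with the same stride gives a constant count of 2. This is only a counting argument under the stated assumption; it does not reproduce the MFF-DNet up-sampling path itself.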
Keywords: Brain tumor segmentation, Deep learning, Dual-path model, Magnetic resonance imaging, Multi-modal images
Review history: Received 28 June 2021, Revised 18 October 2021, Accepted 15 November 2021, Available online 17 November 2021, Version of Record 28 February 2022.
Official URL: https://doi.org/10.1016/j.patcog.2021.108434