Batch-normalized Mlpconv-wise supervised pre-training network in network

Authors: Xiaomeng Han, Qun Dai

Abstract

Deep multi-layered neural networks contain multiple levels of nonlinearity, which allows them to represent highly varying nonlinear functions compactly. In this paper, we propose a new deep architecture with enhanced model discrimination ability, which we refer to as the mlpconv-wise supervised pre-training network in network (MPNIN). In MPNIN, information abstraction is facilitated within the local receptive fields. The proposed architecture builds on the recently developed NIN structure, which slides a universal approximator, such as a multilayer perceptron with rectifier units, across an image to extract features. However, the random initialization used in NIN can lead gradient-based optimization to poor solutions. We use mlpconv-wise supervised pre-training to remedy this defect, because this pre-training technique may help overcome the difficulties of training deep networks by better initializing the weights of all layers. Moreover, batch normalization is applied to reduce internal covariate shift by pre-conditioning the model. Empirical investigations are conducted on the Modified National Institute of Standards and Technology (MNIST), Canadian Institute for Advanced Research (CIFAR-10), CIFAR-100, Street View House Numbers (SVHN), US Postal Service (USPS), Columbia University Image Library (COIL20), COIL100 and Olivetti Research Ltd (ORL) datasets, and the results verify the effectiveness of the proposed MPNIN architecture.
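To make the two ingredients named in the abstract concrete, the sketch below shows, in PyTorch, a batch-normalized mlpconv block (a convolution followed by 1x1 convolutions, i.e., a small MLP slid across the image) and a greedy mlpconv-wise supervised pre-training loop that trains one block at a time under a temporary global-average-pooling classifier head. All layer sizes, the optimizer, the head design, and the names `MlpConvBlock` and `pretrain_mlpconv_wise` are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class MlpConvBlock(nn.Module):
    """A batch-normalized mlpconv block: one k x k convolution followed by
    two 1x1 convolutions (the micro-MLP slid across the image), each with
    batch normalization and ReLU. Channel and kernel sizes are illustrative."""
    def __init__(self, in_channels, mid_channels, out_channels, kernel_size=5):
        super().__init__()
        self.out_channels = out_channels
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, kernel_size=1),
            nn.BatchNorm2d(mid_channels), nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, kernel_size=1),
            nn.BatchNorm2d(out_channels), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.layers(x)


def pretrain_mlpconv_wise(blocks, train_loader, num_classes, epochs=1, device="cpu"):
    """Greedy mlpconv-wise supervised pre-training (a sketch of the idea in the
    abstract, not the authors' exact procedure): each block is trained in turn
    under a temporary classifier head while the earlier blocks stay frozen."""
    criterion = nn.CrossEntropyLoss()
    frozen = []
    for block in blocks:
        block.to(device).train()
        # temporary head: global average pooling + linear classifier (assumption)
        head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                             nn.Linear(block.out_channels, num_classes)).to(device)
        optimizer = torch.optim.Adam(
            list(block.parameters()) + list(head.parameters()), lr=1e-3)
        for _ in range(epochs):
            for images, labels in train_loader:
                images, labels = images.to(device), labels.to(device)
                with torch.no_grad():            # earlier blocks are not updated
                    feats = images
                    for b in frozen:
                        feats = b(feats)
                loss = criterion(head(block(feats)), labels)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        frozen.append(block.eval())
    # the pre-trained stack then serves as the initialization of the full network
    return nn.Sequential(*blocks)
```

After this mlpconv-wise initialization, the stacked blocks would presumably be fine-tuned end to end on the same labelled data; pooling between blocks and the number of blocks are further design choices left open in this sketch.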

Keywords: Deep learning (DL), Mlpconv-wise supervised pre-training network in network (MPNIN), Network in network (NIN) structure, Mlpconv layer, Batch normalization


Paper URL: https://doi.org/10.1007/s10489-017-0968-2