Context prior-based with residual learning for face detection: A deep convolutional encoder–decoder network

Highlights:

• We argue that the contextual semantic features extracted by deep convolutional neural networks can be regarded as residual information that improves the performance of a face detector. A residual learning mechanism is therefore introduced to model each encoder–decoder pair of the network.

• We use the receptive field to design the encoder subnetwork, and study the decoder subnetwork in detail, including its inner structure, the computation of the anchor box, and the relationship between the feature cell and the anchor box.

• We further analyze several important factors that affect the performance of the network, including the scale of each hierarchical feature, the anchor box size, and different training parameters.

Review history: Received 28 September 2018, Revised 9 May 2020, Accepted 16 July 2020, Available online 25 July 2020, Version of Record 29 July 2020.