Discriminant Deep Feature Learning Based on Joint Supervision Loss and Multi-layer Feature Fusion for Heterogeneous Face Recognition

Abstract

Heterogeneous face recognition (HFR) remains a challenging problem in the computer vision community because of the large appearance difference between the near-infrared (NIR) and visible-light (VIS) modalities. Recently, breakthroughs have been made in conventional face recognition by applying deep learning to huge numbers of labeled VIS face samples. However, the same deep learning approach cannot simply be applied to the HFR task, owing to the large domain difference and the shortage of paired images across modalities for training. In general, the pooling layers of a deep network perform feature reduction but also discard useful face information, which degrades HFR performance. It is therefore important to eliminate modality-related information while retaining as much facial identity information as possible. In this paper, we propose a novel method called Discriminant Deep Feature Learning Based on Joint Supervision Loss and Multi-layer Feature Fusion (DDFLJM) for the HFR task. Most available CNNs use the softmax loss function as the supervision signal for training the deep model. To enhance the discriminative power of the deeply learned features, this paper proposes a new loss function called Scatter Loss (SL), which embeds both inter-class and intra-class information to train the deep model effectively. To make full use of the various layers of the deep network, a Dimension Reduction Block (DRB) is designed to extract auxiliary features from multiple mid-level layers. An orthogonality constraint is introduced into the DRB to reduce the spectral variations between the two modalities. The proposed SL is applied to multiple layers of the network for joint supervision training, which enables those layers to learn discriminative identity features. Moreover, a Modified Gate Two-stream Neural Network (MGTNN) is adopted to fuse the multi-layer features.
Extensive experiments are carried out on two challenging NIR-VIS HFR datasets, CASIA NIR-VIS 2.0 and Oulu-CASIA NIR-VIS, demonstrating the superiority of the proposed method.