Enhancing Face Recognition from Massive Weakly Labeled Data of New Domains

Authors: Wei Xu, Junyu Wu, Shengyong Ding, Linggan Lian, Hongyang Chao

Abstract

Training data are critical to face recognition systems, but labeling a large-scale dataset for a particular domain requires substantial manual effort. Without a dataset that matches the target domain, a strong face recognition model cannot be obtained from existing public datasets alone. In this paper, we propose a semi-supervised method that automatically constructs a strong dataset from massive weakly labeled data; models trained on this dataset achieve better performance on the target domain. In the case of Asian face recognition, a well-trained VRCN model trained on CASIA, which achieves 98.63% on LFW and 91.76% on YTF, achieves only an 88.53% recognition rate on our test dataset of Asian faces. We collect 530,560 weakly labeled Asian face images of 7,962 identities and obtain a cleaned dataset of size 285,933. A model trained on the cleaned dataset with the VRCN network and the same training strategy achieves a 95.33% recognition rate on the Asian face test dataset, an improvement of 6.8 percentage points.
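The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the derived quantities (retention ratio, average images per identity, relative error reduction) are computed here from the reported numbers rather than stated by the authors.

```python
# Figures as reported in the abstract; variable names are hypothetical.
collected_images = 530_560   # weakly labeled Asian face images collected
identities = 7_962           # identities in the collected data
cleaned_size = 285_933       # size of the dataset after cleaning
baseline_acc = 88.53         # % on the Asian test set, VRCN model from CASIA
enhanced_acc = 95.33         # % on the Asian test set, VRCN model from cleaned data

# Fraction of the collected data that survives cleaning.
retained = cleaned_size / collected_images

# Average number of collected images per identity (before cleaning).
images_per_identity = collected_images / identities

# Absolute gain in percentage points, and the share of the baseline's
# errors that the enhanced model removes (derived, not from the paper).
gain = enhanced_acc - baseline_acc
relative_error_reduction = gain / (100.0 - baseline_acc)

print(f"retained after cleaning: {retained:.1%}")
print(f"images per identity:     {images_per_identity:.1f}")
print(f"absolute gain:           {gain:.2f} points")
print(f"relative error cut:      {relative_error_reduction:.1%}")
```

The absolute gain matches the 6.8-point improvement reported above; the other three values are only derived summaries of the same numbers.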

Keywords: Face recognition, Dataset construction, Model enhancing

Paper URL: https://doi.org/10.1007/s11063-018-9839-z