Mutual calibration training: Training deep neural networks with noisy labels using dual-models

Authors:

Highlights:

Abstract

A precise large-scale dataset is crucial for supervising the training of deep neural networks (DNNs) in image classification. However, manually annotating a large-scale dataset is time-consuming, which limits the scalability of supervised training. On the other hand, it is relatively easy to obtain a small clean dataset together with a vast amount of data with noisy labels, but training on a noisy dataset causes the performance of deep networks to drop dramatically. To overcome this problem, this work studies how to effectively and efficiently train deep networks on a large noisy dataset in conjunction with a small clean dataset. One problem with transfer learning from a small clean dataset is that the model has far more parameters than training examples, so it risks over-fitting the clean data. Hence, we propose a new approach, called online easy example mining (OEEM), to train deep networks on the entire noisy dataset. OEEM selects clean samples to guide training without human annotation by estimating the confidence of the observed labels from the model's predictions. However, the sample-selection bias in OEEM can trap the model in a local optimum. Consequently, we propose a general framework called Mutual Calibration Training (MCT), which combines the ideas of transfer learning and OEEM using dual models and is robust against different noise levels and noise types. Finally, we conduct experiments on synthetic and real-world datasets with different noise types and noise rates, and the results demonstrate the effectiveness of our approach.
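
The abstract describes OEEM and the dual-model calibration only at a high level. The sketch below illustrates one plausible reading, as an assumption rather than the paper's actual method: each model keeps the mini-batch samples whose predicted probability for the observed (possibly noisy) label exceeds a threshold, and each model is then updated on the subset selected by its peer, which counters the sample-selection bias a single model would accumulate. The threshold `tau` and the names `select_easy_examples` and `mutual_calibration_step` are illustrative and not taken from the paper.

```python
# Minimal sketch of confidence-based "easy example" selection with dual models,
# under the assumptions stated above; not the paper's reference implementation.
import torch
import torch.nn.functional as F

def select_easy_examples(logits, noisy_labels, tau=0.5):
    """Boolean mask of likely-clean samples: predicted probability of the
    observed label is at least tau (tau is an assumed hyperparameter)."""
    probs = F.softmax(logits, dim=1)
    label_conf = probs.gather(1, noisy_labels.unsqueeze(1)).squeeze(1)
    return label_conf >= tau

def mutual_calibration_step(model_a, model_b, opt_a, opt_b,
                            images, noisy_labels, tau=0.5):
    """One step on a noisy mini-batch: each model selects the samples it
    trusts, and the peer model is trained on that subset."""
    logits_a = model_a(images)
    logits_b = model_b(images)

    with torch.no_grad():
        mask_a = select_easy_examples(logits_a, noisy_labels, tau)  # trusted by A
        mask_b = select_easy_examples(logits_b, noisy_labels, tau)  # trusted by B

    # Cross-update: A learns from B's selection and vice versa, so neither
    # model is trained only on the samples it already agrees with.
    if mask_b.any():
        loss_a = F.cross_entropy(logits_a[mask_b], noisy_labels[mask_b])
        opt_a.zero_grad(); loss_a.backward(); opt_a.step()
    if mask_a.any():
        loss_b = F.cross_entropy(logits_b[mask_a], noisy_labels[mask_a])
        opt_b.zero_grad(); loss_b.backward(); opt_b.step()
```

In practice, the transfer-learning component described in the abstract would presumably correspond to initializing or periodically fine-tuning both models on the small clean dataset; that part is omitted here for brevity.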

Keywords:

Article history: Received 23 January 2021, Revised 3 August 2021, Accepted 4 September 2021, Available online 13 September 2021, Version of Record 29 September 2021.

DOI: https://doi.org/10.1016/j.cviu.2021.103277