A survey of regularization strategies for deep models

作者:Reza Moradi, Reza Berangi, Behrouz Minaei

摘要

The most critical concern in machine learning is how to make an algorithm that performs well both on training data and new data. No free lunch theorem implies that each specific task needs its own tailored machine learning algorithm to be designed. A set of strategies and preferences are built into learning machines to tune them for the problem at hand. These strategies and preferences, with the core concern of generalization improvement, are collectively known as regularization. In deep learning, because of a considerable number of parameters, a great many forms of regularization methods are available to the deep learning community. Developing more effective regularization strategies has been the subject of significant research efforts in recent years. However, it is difficult for developers to choose the most suitable strategy for their problem at hand, because there is no comparative study regarding the performance of different strategies. In this paper, at the first step, the most effective regularization methods and their variants are presented and analyzed in a systematic approach. At the second step, comparative research on regularization techniques is presented in which the testing errors and computational costs are evaluated in a convolutional neural network, using CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html) dataset. In the end, different regularization methods are compared in terms of accuracy of the network, the number of epochs for the network to be trained and the number of operations per input sample. Also, the results are discussed and interpreted based on the employed strategy. The experiment results showed that weight decay and data augmentation regularizations have little computational side effects so can be used in most applications. In the case of enough computational resources, Dropout family methods are rational to be used. Moreover, in the case of abundant computational resources, batch normalization family and ensemble methods are reasonable strategies to be employed.

论文关键词:Deep learning, Convolutional neural network, Regularization, Overfitting, Generalization

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10462-019-09784-7