Understanding the loss landscape of one-hidden-layer ReLU networks
Abstract
This paper proves that, for one-hidden-layer ReLU networks, all differentiable local minima are global within each differentiable region. Necessary and sufficient conditions are given for the existence of differentiable local minima, saddle points, and non-differentiable local minima, together with their locations when they exist. Building on this theory, a linear-programming-based algorithm is designed to test for the existence of differentiable local minima, and it is used to predict whether spurious local minima exist for the MNIST and CIFAR-10 datasets. Experimental results show that there are no spurious local minima for most typical weight vectors. These theoretical predictions are verified by their consistency with the results of gradient descent search.
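The paper's LP-based algorithm is not reproduced in the abstract, but the general idea it relies on can be illustrated: a one-hidden-layer ReLU network is piecewise linear, with differentiable regions indexed by the hidden units' activation patterns, and membership/realizability questions for such a region reduce to linear-programming feasibility. The sketch below is a hypothetical toy illustration (not the authors' algorithm): it uses SciPy's `linprog` to check whether any hidden weight vector realizes a given activation pattern on the data, with a small `margin` keeping the solution strictly inside the region.

```python
import numpy as np
from scipy.optimize import linprog

def pattern_feasible(X, pattern, margin=1e-6):
    """LP feasibility check: does some weight vector w realize the given
    ReLU activation pattern on the data X?

    X       : (n, d) data matrix, one sample per row
    pattern : length-n array of +1 (unit active on sample) / -1 (inactive)
    Returns True iff a w exists with pattern[i] * (w @ X[i]) >= margin
    for every sample i, i.e. the differentiable region is non-empty.
    """
    n, d = X.shape
    # Each sign constraint pattern[i] * (X[i] @ w) >= margin becomes the
    # standard-form inequality  -pattern[i] * X[i] @ w <= -margin.
    A_ub = -pattern[:, None] * X
    b_ub = -margin * np.ones(n)
    # Zero objective: we only care about feasibility, not optimality.
    res = linprog(c=np.zeros(d), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * d, method="highs")
    return res.status == 0  # status 0 = feasible solution found
```

For example, on `X = [[1, 0], [-1, 0]]` the all-active pattern `[+1, +1]` is infeasible (it would require both `w[0] >= margin` and `w[0] <= -margin`), so that candidate region is empty and can be skipped when searching for local minima.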
Keywords: Deep learning theory, ReLU networks, Loss landscape, Local minima, Saddle points
Article history: Received 11 November 2020, Revised 12 January 2021, Accepted 2 March 2021, Available online 5 March 2021, Version of Record 12 March 2021.
DOI: https://doi.org/10.1016/j.knosys.2021.106923