LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence
Authors: Rahul Yedida, Snehanshu Saha, Tejas Prashanth
Abstract
We present a novel theoretical framework for computing large, adaptive learning rates. Our framework makes minimal assumptions about the activations used and exploits the functional properties of the loss function. Specifically, we show that the inverse of the Lipschitz constant of the loss function is an ideal learning rate. We analytically derive formulas for the Lipschitz constants of several loss functions and, through extensive experimentation, demonstrate the strength of our approach across several architectures and datasets. In addition, we detail the computation of learning rates for other optimizers, namely SGD with momentum, RMSprop, and Adam. Compared to standard choices of learning rate, our approach converges faster and yields better results.
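As an illustrative sketch (not the paper's exact derivation), the core idea of taking the learning rate as the inverse of the loss function's Lipschitz constant can be shown for a least-squares loss, whose gradient is Lipschitz-continuous with constant lambda_max(X^T X)/m; the function name `lipschitz_lr_mse` and the synthetic data below are assumptions for illustration only.

```python
import numpy as np

def lipschitz_lr_mse(X):
    """Return 1/K, where K = lambda_max(X^T X) / m is the Lipschitz
    constant of the gradient of the MSE loss (1/2m) * ||X w - y||^2."""
    m = X.shape[0]
    K = np.linalg.eigvalsh(X.T @ X).max() / m  # largest eigenvalue of X^T X, scaled by 1/m
    return 1.0 / K

# Plain gradient descent on a synthetic least-squares problem,
# using the theoretically computed step size.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=200)

w = np.zeros(5)
eta = lipschitz_lr_mse(X)
for _ in range(500):
    grad = X.T @ (X @ w - y) / X.shape[0]  # gradient of the MSE loss
    w -= eta * grad
```

The paper's actual formulas cover other losses (e.g. cross-entropy) and other optimizers; this sketch only illustrates the 1/K step-size choice in the simplest convex setting.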
Keywords: Lipschitz constant, Adaptive learning, Machine learning, Deep learning
Paper link: https://doi.org/10.1007/s10489-020-01892-0