Learning flat representations with artificial neural networks
Authors: Vlad Constantinescu, Costin Chiru, Tudor Boloni, Adina Florea, Robi Tacutu
Abstract
In this paper, we propose a method of learning representation layers with squashing activation functions within a deep artificial neural network, directly addressing the vanishing gradients problem. The proposed solution is derived by solving the maximum likelihood estimator for components of the posterior representation, which are approximately Beta-distributed, formulated in the context of variational inference. This approach not only improves the performance of deep neural networks with squashing activation functions on some of the hidden layers (including in discriminative learning) but can also be employed to produce sparse codes.
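The abstract does not give the paper's estimator, but the two ingredients it names can be illustrated: squashing (sigmoid) activations saturate, so their gradient `a*(1-a)` vanishes near 0 and 1, and the squashed activations can be fit by a Beta distribution. Below is a minimal, hypothetical sketch using simulated pre-activations and a simple method-of-moments Beta fit (not the paper's maximum likelihood procedure); all names and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Simulate pre-activations of a hidden layer and squash them to (0, 1).
z = rng.normal(loc=0.0, scale=2.0, size=10_000)
a = sigmoid(z)

# Method-of-moments estimate of Beta(alpha, beta) parameters for the
# activations. This is an illustrative stand-in for the paper's
# maximum likelihood estimator, which is not given in the abstract.
m, v = a.mean(), a.var()
common = m * (1.0 - m) / v - 1.0
alpha, beta = m * common, (1.0 - m) * common

# The sigmoid derivative a*(1-a) is tiny for saturated units,
# which is the vanishing gradients problem the paper targets.
grad = a * (1.0 - a)
print(alpha, beta, grad.min())
```

Units far into the tails of `z` contribute near-zero gradient, which is why squashing activations in deep stacks are hard to train without a remedy such as the one the paper proposes.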
Keywords: Learning representations, Infomax, Beta distribution, Vanishing gradients
Paper link: https://doi.org/10.1007/s10489-020-02032-4