News

This paper aims to explore seven commonly used optimization algorithms in deep learning: SGD, Momentum-SGD, NAG, AdaGrad, RMSprop, AdaDelta, and Adam. Based on an overview of their theories and ...
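To make the contrast among the listed algorithms concrete, here is a minimal sketch (not taken from the paper) of one update step for two of them, plain SGD and Adam, on a toy quadratic loss; the learning rates and the loss itself are illustrative assumptions.

```python
import numpy as np

def loss_grad(theta):
    # Gradient of the toy loss 0.5 * ||theta||^2 (assumed for illustration).
    return theta

def sgd_step(theta, lr=0.1):
    # Vanilla SGD: step against the raw gradient.
    return theta - lr * loss_grad(theta)

def adam_step(theta, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Adam: exponential moving averages of the gradient (m) and its square (v),
    # bias-corrected, then an element-wise rescaled step.
    g = loss_grad(theta)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

theta = np.array([1.0, -2.0])
print(sgd_step(theta))

m = np.zeros_like(theta)
v = np.zeros_like(theta)
theta_adam, m, v = adam_step(theta, m, v, t=1)
print(theta_adam)
```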
Natural Gradient Descent (NGD) is a second-order neural network training method that preconditions the gradient with the inverse of the Fisher Information Matrix (FIM). Although NGD provides an ...
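As a rough illustration of the preconditioning idea, the sketch below (an assumption, not the paper's implementation) takes one natural-gradient step theta ← theta − lr · F⁻¹ g on a toy linear regression, where F is an empirical Fisher approximation built from per-example gradients and g is the mean gradient; the damping term and learning rate are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 3))   # toy inputs
y = rng.normal(size=32)        # toy targets
theta = np.zeros(3)            # linear-model parameters

def per_example_grads(theta):
    # Gradients of 0.5 * (x @ theta - y)^2 for each example, shape (N, D).
    residual = X @ theta - y
    return residual[:, None] * X

G = per_example_grads(theta)
g = G.mean(axis=0)             # mean gradient
F = (G.T @ G) / len(G)         # empirical Fisher approximation of the FIM
F += 1e-3 * np.eye(3)          # damping so the linear solve is well-posed

lr = 0.5
theta = theta - lr * np.linalg.solve(F, g)   # preconditioned (natural) step
print(theta)
```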