News

In this paper, we propose a simple variant of the original SVRG, called variance-reduced stochastic gradient descent (VR-SGD). Unlike the choices of snapshot and starting points in SVRG and its ...
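For context, below is a minimal sketch of the SVRG scheme that VR-SGD builds on (Johnson & Zhang, 2013), shown on a toy least-squares problem; the snapshot point and the per-epoch starting point are exactly the choices the abstract says VR-SGD revisits. The function names, the toy problem, and the step size are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def svrg(grad_i, w0, n, lr=0.01, epochs=10, m=None):
    """Plain SVRG. grad_i(w, i) returns the gradient of component f_i at w."""
    rng = np.random.default_rng(0)
    m = m or 2 * n                       # inner-loop length
    w = w0.copy()
    for _ in range(epochs):
        w_snap = w.copy()                # snapshot point
        mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)  # full gradient
        for _ in range(m):
            i = rng.integers(n)
            # variance-reduced stochastic gradient
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= lr * g
        # This variant restarts each epoch from the last iterate (one common
        # option); choosing these points differently is where VR-SGD departs.
    return w

# toy usage: f_i(w) = 0.5 * (a_i^T w - b_i)^2
n, d = 100, 5
rng = np.random.default_rng(1)
A, x_true = rng.normal(size=(n, d)), rng.normal(size=d)
b = A @ x_true
grad_i = lambda w, i: (A[i] @ w - b[i]) * A[i]
print(np.linalg.norm(svrg(grad_i, np.zeros(d), n) - x_true))
```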
The first chapter of Neural Networks: Tricks of the Trade strongly advocates the stochastic back-propagation method to train neural networks. This is in fact an instance of a more general technique ...
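To make the method concrete, here is a minimal sketch of stochastic back-propagation: gradients are computed by the chain rule for one randomly drawn example at a time and used for an immediate SGD update. The network shape, toy data, and hyperparameters are illustrative assumptions, not taken from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)   # toy XOR-like target

W1, b1 = rng.normal(scale=0.5, size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.5, size=8), 0.0
lr = 0.1

for step in range(5000):
    i = rng.integers(len(X))                # one example -> "stochastic"
    x, t = X[i], y[i]
    # forward pass
    h = np.tanh(x @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    # backward pass: back-propagate the logistic loss through the net
    dlogit = p - t
    dW2, db2 = dlogit * h, dlogit
    dh = dlogit * W2 * (1.0 - h**2)         # through tanh
    dW1, db1 = np.outer(x, dh), dh
    # immediate update from this single example
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

p_all = 1.0 / (1.0 + np.exp(-(np.tanh(X @ W1 + b1) @ W2 + b2)))
print("train accuracy:", np.mean((p_all > 0.5) == y))
```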
Distributed stochastic gradient descent (SGD) has attracted considerable recent attention due to its potential for scaling computational resources, reducing training time, and helping protect user ...
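As a rough illustration of the basic scheme behind distributed SGD, the sketch below simulates synchronous data-parallel SGD in a single process: each worker computes a mini-batch gradient on its own data shard, and the averaged gradient drives one shared update. The worker count, model, and data are illustrative assumptions; a real deployment would run the workers on separate machines with an all-reduce or parameter server in place of the in-process loop.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, workers = 400, 10, 4
A, x_true = rng.normal(size=(n, d)), rng.normal(size=d)
b = A @ x_true
shards = np.array_split(np.arange(n), workers)   # one data shard per worker

def shard_grad(w, idx, batch=8):
    """Mini-batch least-squares gradient on one worker's shard."""
    i = rng.choice(idx, size=batch, replace=False)
    return A[i].T @ (A[i] @ w - b[i]) / batch

w = np.zeros(d)
for step in range(500):
    grads = [shard_grad(w, idx) for idx in shards]   # parallel in practice
    w -= 0.05 * np.mean(grads, axis=0)               # synchronous averaging
print(np.linalg.norm(w - x_true))
```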