A Novel Distributed Variant of Stochastic Gradient Descent and Its Optimization

Yi-qi WANG, Ya-wei ZHAO, Zhan SHI, Jian-ping YIN

Abstract


In the age of big data, large-scale learning problems have become increasingly significant. Distributed machine learning algorithms have therefore drawn considerable interest, particularly those based on Stochastic Gradient Descent (SGD) with variance reduction techniques. In this paper, we propose and implement a distributed programming strategy for a recently developed variance-reduced SGD-based algorithm and analyze its performance under various parameter settings. Moreover, we introduce a new SGD-based algorithm named BATCHVR, which computes the full gradients required by SGD in each stage incrementally, using batches. Experiments on the TH-1A HPC cluster demonstrate the effectiveness of the distributed strategy and the excellent performance of the proposed algorithm.
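The abstract describes BATCHVR only at a high level. As a reading aid, the following is a minimal sketch assuming the algorithm follows the standard SVRG template (full-gradient snapshot per stage, variance-reduced inner updates), with the stage-wise full gradient accumulated over mini-batches rather than in a single monolithic pass, as the abstract suggests. All names here (`batchvr_sketch`, `grad_fn`, `batch_size`) are hypothetical and not taken from the paper.

```python
import numpy as np

def batchvr_sketch(grad_fn, w0, n, num_stages, inner_steps, lr, batch_size):
    """Hypothetical SVRG-style loop with a batch-wise full-gradient pass.

    grad_fn(w, idx) must return the gradient averaged over samples idx at w.
    This is a sketch of the general technique, not the authors' implementation.
    """
    rng = np.random.default_rng(0)
    w = w0.copy()
    for _ in range(num_stages):
        w_snap = w.copy()
        # Accumulate the stage's full gradient incrementally over batches,
        # instead of one pass over all n samples at once.
        full_grad = np.zeros_like(w)
        for start in range(0, n, batch_size):
            idx = np.arange(start, min(start + batch_size, n))
            full_grad += grad_fn(w_snap, idx) * len(idx)
        full_grad /= n
        # Variance-reduced inner loop (standard SVRG update rule).
        for _ in range(inner_steps):
            i = rng.integers(n)
            g = grad_fn(w, [i]) - grad_fn(w_snap, [i]) + full_grad
            w -= lr * g
    return w

# Example usage on a synthetic least-squares problem (also hypothetical):
X = np.random.randn(1000, 10)
y = X @ np.ones(10)
grad = lambda w, idx: X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
w = batchvr_sketch(grad, np.zeros(10), n=1000, num_stages=5,
                   inner_steps=2000, lr=0.01, batch_size=100)
```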

Keywords


SGD; variance reduction; large-scale machine learning


DOI
10.12783/dtcse/aics2016/8245
