Stochastic Optimization

Local SGD

Introduced by Stich in Local SGD Converges Fast and Communicates Little

Local SGD is a distributed training technique in which each worker runs SGD independently on its own data shard and the resulting model parameters are averaged only periodically, rather than after every step, which substantially reduces communication cost.

Source: Local SGD Converges Fast and Communicates Little
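The scheme above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it uses a toy scalar quadratic objective with synthetic gradient noise, and all names (`local_sgd`, `local_steps`, etc.) are chosen here for clarity.

```python
import random

def local_sgd(num_workers=4, rounds=10, local_steps=8, lr=0.1, seed=0):
    # Toy objective: f(w) = 0.5 * (w - 3)^2, so grad f(w) = w - 3.
    # Each worker sees the true gradient plus Gaussian noise, mimicking
    # stochastic gradients computed on different data shards.
    rng = random.Random(seed)
    w = 0.0  # shared model parameter (scalar for illustration)
    for _ in range(rounds):
        worker_models = []
        for _k in range(num_workers):
            wk = w  # each worker starts from the last averaged model
            for _ in range(local_steps):
                grad = (wk - 3.0) + rng.gauss(0.0, 0.1)
                wk -= lr * grad  # independent local SGD step
            worker_models.append(wk)
        # Communication happens only here: once per round, the workers'
        # parameters are averaged into a single shared model.
        w = sum(worker_models) / num_workers
    return w

print(local_sgd())
```

With `local_steps=1` this reduces to fully synchronous parallel SGD; increasing `local_steps` trades communication frequency against divergence between the workers' local sequences.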
