# D$^2$: Decentralized Training over Decentralized Data

19 Mar 2018 · Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu

When training a machine learning model using multiple workers, each of which collects data from its own data sources, it is most useful when the data collected by different workers can be *unique* and *different*. Ironically, recent analysis of decentralized parallel stochastic gradient descent (D-PSGD) relies on the assumption that the data hosted on different workers are *not too different*...

