no code implementations • 3 May 2023 • Timothy Castiglia, Yi Zhou, Shiqiang Wang, Swanand Kadhe, Nathalie Baracaldo, Stacy Patterson
As part of the training, the parties wish to remove unimportant features in the system to improve generalization, efficiency, and explainability.
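The feature-removal idea above can be sketched as importance-based pruning. The magnitude-based score, the `prune_features` helper, and the keep fraction below are illustrative assumptions, not the paper's exact criterion:

```python
# Hypothetical sketch of feature removal in a vertically partitioned
# setting: each party scores its own features (here by weight
# magnitude, an assumed proxy for importance) and drops the weakest.

def prune_features(weights, keep_fraction=0.5):
    """Return the indices of features to keep, ranked by |weight|."""
    ranked = sorted(range(len(weights)),
                    key=lambda i: abs(weights[i]), reverse=True)
    k = max(1, int(len(weights) * keep_fraction))
    return sorted(ranked[:k])

party_weights = [0.02, -1.3, 0.7, 0.001]
kept = prune_features(party_weights, keep_fraction=0.5)
# kept == [1, 2]  (the two largest-magnitude features survive)
```

Removing the low-scoring features shrinks each party's local model, which is one way the stated gains in efficiency and explainability could be realized.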
no code implementations • 16 Jun 2022 • Timothy Castiglia, Anirban Das, Shiqiang Wang, Stacy Patterson
Our work provides the first theoretical analysis of the effect message compression has on distributed training over vertically partitioned data.
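A minimal sketch of the kind of message compression analyzed in this line of work, assuming a uniform scalar quantizer (the specific compressor and bit width are illustrative, not the paper's exact scheme):

```python
def quantize(vec, num_bits=4):
    """Uniformly quantize a vector to num_bits per entry before
    transmission (an assumed, commonly studied compressor)."""
    lo, hi = min(vec), max(vec)
    levels = (1 << num_bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in vec]
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Reconstruct an approximation of the original vector."""
    return [lo + c * scale for c in codes]

v = [0.0, 0.5, 1.0]
codes, lo, scale = quantize(v, num_bits=2)
# codes == [0, 2, 3] with lo == 0.0 and scale == 1/3;
# the reconstruction error per entry is at most about scale / 2.
```

The per-entry error bound of roughly half a quantization step is the kind of compression-error term that a convergence analysis over vertically partitioned data must account for.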
no code implementations • 19 Aug 2021 • Anirban Das, Timothy Castiglia, Shiqiang Wang, Stacy Patterson
Each silo contains a hub and a set of clients, with the silo's vertical data shard partitioned horizontally across its clients.
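The hybrid partitioning can be illustrated concretely. The silo and client names and the toy 4x4 dataset below are assumptions for illustration only:

```python
# The full feature set is split vertically across silos, and each
# silo's feature shard is split horizontally (by sample) across its
# clients, as described above.

dataset = [[r * 10 + c for c in range(4)] for r in range(4)]  # 4 samples x 4 features

silo_features = {"silo_A": [0, 1], "silo_B": [2, 3]}       # vertical split across silos
client_samples = {"client_1": [0, 1], "client_2": [2, 3]}  # horizontal split within a silo

def shard(data, sample_ids, feature_ids):
    """Extract one client's local partition of the dataset."""
    return [[data[r][c] for c in feature_ids] for r in sample_ids]

# client_1 in silo_A holds samples 0-1 of features 0-1:
part = shard(dataset, client_samples["client_1"], silo_features["silo_A"])
# part == [[0, 1], [10, 11]]
```

No single client ever holds a full row or a full column of the dataset, which is what distinguishes this hybrid setting from purely horizontal or purely vertical federated learning.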
no code implementations • ICLR 2021 • Timothy Castiglia, Anirban Das, Stacy Patterson
We propose Multi-Level Local SGD, a distributed stochastic gradient method for learning a smooth, non-convex objective in a multi-level communication network with heterogeneous workers.
1 code implementation • 27 Jul 2020 • Timothy Castiglia, Anirban Das, Stacy Patterson
In our algorithm, sub-networks execute a distributed SGD algorithm using a hub-and-spoke paradigm, and the hubs periodically average their models with neighboring hubs.
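A hedged sketch of one such two-level scheme, using full gradients on a toy quadratic for clarity (the step size, topology, local-step count, and averaging period are illustrative assumptions, not the papers' exact setup):

```python
# Toy simulation of two-level local SGD: workers take local steps,
# each hub averages its own workers (hub-and-spoke), and hubs
# periodically average with each other.

def local_steps(x, grad, steps, lr):
    """Run a few local gradient steps from the broadcast model
    (deterministic gradients here, for simplicity)."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

def mean(xs):
    return sum(xs) / len(xs)

grad = lambda x: 2 * (x - 3.0)  # gradient of (x - 3)^2, minimizer x* = 3

# Two hubs with two workers each, starting from heterogeneous points.
hubs = [[0.0, 1.0], [5.0, 9.0]]
for rnd in range(20):
    # Workers run local steps from their hub's broadcast model.
    hubs = [[local_steps(x, grad, steps=5, lr=0.1) for x in workers]
            for workers in hubs]
    # Each hub averages its own workers (hub-and-spoke step) ...
    hub_models = [mean(workers) for workers in hubs]
    # ... and hubs periodically average with neighboring hubs.
    if rnd % 2 == 1:
        global_model = mean(hub_models)
        hub_models = [global_model, global_model]
    # Hubs broadcast the averaged model back to their workers.
    hubs = [[m] * 2 for m in hub_models]

# All models contract toward the minimizer x* = 3 despite starting
# from heterogeneous points and averaging only periodically.
```

Each local step contracts the distance to the minimizer by a constant factor, and the two levels of periodic averaging keep the heterogeneous workers from drifting apart, which is the intuition behind the convergence guarantees for this family of methods.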