Search Results for author: Dominik Grewe

Found 5 papers, 2 papers with code

Automatic Discovery of Composite SPMD Partitioning Strategies in PartIR

no code implementations · 7 Oct 2022 · Sami Alabed, Dominik Grewe, Juliana Franco, Bart Chrzaszcz, Tom Natan, Tamara Norman, Norman A. Rink, Dimitrios Vytiniotis, Michael Schaarschmidt

Large neural network models are commonly trained by combining advanced parallelism strategies within the single-program, multiple-data (SPMD) paradigm.
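No code accompanies this paper, but the SPMD idea itself is easy to illustrate. The following is a toy sketch in plain Python, not PartIR's implementation: every "device" runs the same program on its own shard of the batch, and the per-shard gradients are combined with an all-reduce (here, a mean). The model (`y = params * x`), the loss, and all function names are our own illustrative choices.

```python
# Toy SPMD data-parallel training step (illustrative only, not PartIR).
# Each "device" runs the same program on a different shard of the batch.

def local_gradient(params, batch_shard):
    # Gradient of a squared-error loss for a toy model y = params * x.
    return sum(2 * (params * x - y) * x for x, y in batch_shard) / len(batch_shard)

def spmd_step(params, batch, num_devices, lr=0.1):
    # Shard the batch round-robin across devices.
    shards = [batch[i::num_devices] for i in range(num_devices)]
    # Same program on every device, different data.
    grads = [local_gradient(params, s) for s in shards]
    # All-reduce: average gradients across devices.
    g = sum(grads) / num_devices
    return params - lr * g
```

With equal-sized shards, the averaged shard gradients equal the full-batch gradient, so the update is independent of the device count — which is what makes composing further partitioning strategies on top of data parallelism tractable.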

Automap: Towards Ergonomic Automated Parallelism for ML Models

no code implementations · 6 Dec 2021 · Michael Schaarschmidt, Dominik Grewe, Dimitrios Vytiniotis, Adam Paszke, Georg Stefan Schmid, Tamara Norman, James Molloy, Jonathan Godwin, Norman Alexander Rink, Vinod Nair, Dan Belov

The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism.

Synthesizing Optimal Parallelism Placement and Reduction Strategies on Hierarchical Systems for Deep Learning

no code implementations · 20 Oct 2021 · Ningning Xie, Tamara Norman, Dominik Grewe, Dimitrios Vytiniotis

We present a novel characterization of the mapping of multiple parallelism forms (e.g. data and model parallelism) onto hierarchical accelerator systems that is hierarchy-aware and greatly reduces the space of software-to-hardware mappings.

Tasks: Program Synthesis
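To make the hierarchy-aware idea concrete, here is a minimal sketch (our own illustration, not the paper's synthesized strategies) of a two-level all-reduce: values are first reduced within each node over fast intra-node links, then across node leaders over slower inter-node links, and finally broadcast back. The function name and node layout are hypothetical.

```python
# Illustrative two-level (hierarchy-aware) all-reduce, not the paper's code.

def hierarchical_allreduce(values, devices_per_node):
    # Stage 1: reduce within each node (cheap intra-node communication).
    nodes = [values[i:i + devices_per_node]
             for i in range(0, len(values), devices_per_node)]
    node_sums = [sum(node) for node in nodes]
    # Stage 2: reduce across node leaders (expensive inter-node communication).
    total = sum(node_sums)
    # Stage 3: broadcast the reduced value back to every device.
    return [total] * len(values)
```

The point of such a characterization is that only one value per node crosses the slow inter-node links, rather than one per device, which is why the choice of reduction strategy interacts with how parallelism forms are placed on the hierarchy.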

TF-Replicator: Distributed Machine Learning for Researchers

1 code implementation · 1 Feb 2019 · Peter Buchlovsky, David Budden, Dominik Grewe, Chris Jones, John Aslanides, Frederic Besse, Andy Brock, Aidan Clark, Sergio Gómez Colmenarejo, Aedan Pope, Fabio Viola, Dan Belov

We describe TF-Replicator, a framework for distributed machine learning designed for DeepMind researchers and implemented as an abstraction over TensorFlow.

Tasks: BIG-bench Machine Learning, Continuous Control, +1
