Distributed Methods

PipeDream-2BW

Introduced by Narayanan et al. in Memory-Efficient Pipeline-Parallel DNN Training

PipeDream-2BW is an asynchronous pipeline-parallel method that supports memory-efficient pipeline parallelism: a hybrid form of parallelism combining data and model parallelism with input pipelining. It uses a novel pipelining and weight-gradient coalescing strategy, combined with double buffering of weights, to ensure high throughput, a low memory footprint, and weight-update semantics similar to data parallelism. In addition, PipeDream-2BW automatically partitions the model over the available hardware resources while respecting hardware constraints such as the memory capacities of accelerators and the topologies and bandwidths of interconnects. PipeDream-2BW also determines when to employ existing memory-saving techniques, such as activation recomputation, that trade extra computation for a lower memory footprint.
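The 2BW idea above can be sketched in a few lines of plain Python. This is a toy, single-stage simulation (not the paper's implementation): gradients are coalesced over m microbatches into a single accumulator, and at most two weight versions are buffered so that in-flight microbatches can still run backward against the version they were forwarded with. The class name, the learning rate of 0.1, and the scalar weights are all illustrative assumptions.

```python
from collections import deque

class TwoBWStage:
    """Toy sketch of 2BW semantics for one pipeline stage (illustrative only):
    weight gradients are coalesced over m microbatches, and at most two
    weight versions are kept (double buffering), so microbatches still in
    flight can complete against the older version."""

    def __init__(self, w0=0.0, m=4, lr=0.1):
        self.m = m                             # microbatches per weight update
        self.lr = lr                           # illustrative learning rate
        self.versions = deque([w0], maxlen=2)  # double buffer: at most 2 versions
        self.grad_accum = 0.0                  # coalesced weight gradient
        self.seen = 0

    def current_weights(self):
        return self.versions[-1]

    def backward(self, grad):
        # coalesce the gradient instead of storing one copy per microbatch
        self.grad_accum += grad
        self.seen += 1
        if self.seen == self.m:
            # generate a new weight version; the previous version stays
            # buffered for microbatches that were forwarded with it
            new_w = self.versions[-1] - self.lr * self.grad_accum / self.m
            self.versions.append(new_w)
            self.grad_accum = 0.0
            self.seen = 0

stage = TwoBWStage(w0=1.0, m=4)
for g in [0.5, 0.5, 0.5, 0.5]:
    stage.backward(g)
print(len(stage.versions))      # → 2 (old and new version buffered)
print(stage.current_weights())  # → 0.95
```

The key memory saving relative to keeping one weight copy per in-flight microbatch is the `maxlen=2` buffer: no matter how many microbatches are pipelined, only two weight versions ever exist per stage.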

Its two main features, double-buffered weight updates (2BW) and flush mechanisms, ensure high throughput. PipeDream-2BW splits models into stages over multiple workers and replicates each stage an equal number of times (with data-parallel updates across replicas of the same stage). Such parallel pipelines work well for models in which each layer is repeated a fixed number of times (e.g., transformer models).
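The stage-splitting-with-equal-replication layout described above can be illustrated with a small helper. This is a hypothetical sketch, not the planner from the paper: it assumes a model of identical, contiguous blocks (as in a transformer), divides them evenly into stages, and assigns the same number of data-parallel workers to every stage.

```python
def partition_replicated(num_layers, num_stages, replicas_per_stage):
    """Hypothetical planner sketch: split `num_layers` identical blocks into
    `num_stages` contiguous stages and replicate each stage across
    `replicas_per_stage` workers. Returns stage -> (layer ids, worker ids)."""
    assert num_layers % num_stages == 0, "assumes evenly divisible layers"
    per_stage = num_layers // num_stages
    mapping = {}
    next_worker = 0
    for s in range(num_stages):
        layers = list(range(s * per_stage, (s + 1) * per_stage))
        workers = list(range(next_worker, next_worker + replicas_per_stage))
        next_worker += replicas_per_stage
        mapping[s] = (layers, workers)
    return mapping

# 8 identical blocks, 2 stages, 2 replicas per stage -> 4 workers total
print(partition_replicated(8, 2, 2))
# → {0: ([0, 1, 2, 3], [0, 1]), 1: ([4, 5, 6, 7], [2, 3])}
```

Replicas of the same stage (e.g., workers 0 and 1 here) hold the same layers and synchronize their gradients data-parallel style, while different stages form the pipeline.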

Source: Memory-Efficient Pipeline-Parallel DNN Training

Tasks


Task Papers Share
Image Classification 1 33.33%
Machine Translation 1 33.33%
Sentiment Analysis 1 33.33%
