Distributed Methods

Tofu is an intra-layer model-parallel system that partitions very large DNN models across multiple GPU devices to reduce the per-GPU memory footprint. It is designed to partition the dataflow graphs of fine-grained tensor operators used by platforms such as MXNet and TensorFlow. To decide how each operator in the graph should be partitioned, Tofu uses a recursive search algorithm that minimizes the total communication cost.
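To make the idea of intra-layer partitioning concrete, the following is a minimal sketch in plain Python of splitting a single matrix-multiply operator across workers in two different ways. It is not Tofu's implementation or its search algorithm; the function names (`column_parallel`, `row_parallel`, and the helpers) are hypothetical and chosen only for illustration. In both strategies each worker holds only a fraction of the weight matrix, but they differ in what must be exchanged afterwards (concatenating output slices versus summing partial results), which is the kind of trade-off a communication-cost-minimizing search would have to weigh.

```python
def matmul(a, b):
    """Dense matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(m, parts):
    """Split a matrix into `parts` contiguous column blocks of equal width."""
    width = len(m[0]) // parts
    return [[row[p * width:(p + 1) * width] for row in m] for p in range(parts)]


def split_rows(m, parts):
    """Split a matrix into `parts` contiguous row blocks of equal height."""
    height = len(m) // parts
    return [m[p * height:(p + 1) * height] for p in range(parts)]


def column_parallel(x, w, parts):
    """Each worker holds one column block of w and all of x.

    Every worker produces a column slice of the output; the slices are
    concatenated to form the full result.
    """
    outs = [matmul(x, shard) for shard in split_columns(w, parts)]
    return [sum((out[i] for out in outs), []) for i in range(len(x))]


def row_parallel(x, w, parts):
    """Each worker holds one row block of w and the matching column block of x.

    Every worker produces a full-shaped partial result; the partial results
    are summed elementwise to form the full result.
    """
    partials = [
        matmul(xs, ws)
        for xs, ws in zip(split_columns(x, parts), split_rows(w, parts))
    ]
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]


# Small integer example so that results can be compared exactly.
x = [[1, 2, 3, 4], [5, 6, 7, 8]]
w = [[1, 0, 2, 1], [0, 1, 1, 3], [2, 1, 0, 1], [1, 2, 1, 0]]
reference = matmul(x, w)
```

Both `column_parallel(x, w, 2)` and `row_parallel(x, w, 2)` reproduce `reference`, while each simulated worker stores only half of `w`. A real system would place the shards on separate GPUs and replace the concatenation and summation with device-to-device communication.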

Source: Supporting Very Large Models using Automatic Dataflow Graph Partitioning

Tasks


Task                       Papers   Share
Computational Efficiency   1        16.67%
Image Generation           1        16.67%
Recommendation Systems     1        16.67%
Federated Learning         1        16.67%
3D Reconstruction          1        16.67%
Graph Partitioning         1        16.67%
