Distributed Methods

DistDGL is a system for training GNNs in a mini-batch fashion on a cluster of machines. It is based on the Deep Graph Library (DGL), a popular GNN development framework. DistDGL distributes the graph and its associated data (initial features and embeddings) across the machines and uses this distribution to derive a computational decomposition following an owner-compute rule. DistDGL follows a synchronous training approach and allows the ego-networks that form the mini-batches to include non-local nodes. To minimize the overheads associated with distributed computation, DistDGL uses a high-quality, lightweight min-cut graph partitioning algorithm with multiple balancing constraints, which reduces communication overheads and statically balances the computation. It further reduces communication by replicating halo nodes and by using sparse embedding updates. The combination of these design choices allows DistDGL to train high-quality models while achieving high parallel efficiency and memory scalability.

Source: DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs

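To make the workflow concrete, the following is a minimal sketch of distributed mini-batch training with DGL's distributed API. The dataset, the MyGNN model, the ip_config.txt file, the number of partitions, and the sampling fan-outs are illustrative assumptions, and exact module or argument names vary across DGL versions; this is not a verbatim reproduction of the DistDGL paper's setup.

```python
import dgl
import torch

# --- Offline step: partition the graph with METIS-style min-cut partitioning.
# balance_edges asks the partitioner to also balance edge counts across parts,
# mirroring DistDGL's multi-constraint balancing. This is normally a separate script.
g = dgl.data.CiteseerGraphDataset()[0]  # small stand-in for a billion-scale graph
dgl.distributed.partition_graph(
    g, graph_name="my_graph", num_parts=4,
    out_path="parts/", balance_edges=True)

# --- Online step: each trainer process loads the partitioned graph.
dgl.distributed.initialize(ip_config="ip_config.txt")  # hypothetical cluster config file
torch.distributed.init_process_group(backend="gloo")
dist_g = dgl.distributed.DistGraph("my_graph", part_config="parts/my_graph.json")

# Split training nodes so each trainer mostly works on nodes it owns.
train_nids = dgl.distributed.node_split(
    dist_g.ndata["train_mask"], dist_g.get_partition_book())

# Mini-batches are ego-networks built by neighbor sampling; sampled neighbors
# may live on remote machines (halo nodes) and are fetched transparently.
sampler = dgl.dataloading.NeighborSampler([10, 25])
dataloader = dgl.dataloading.DistNodeDataLoader(
    dist_g, train_nids, sampler, batch_size=1024, shuffle=True)

model = torch.nn.parallel.DistributedDataParallel(MyGNN())  # MyGNN: user-defined model
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for input_nodes, output_nodes, blocks in dataloader:
    batch_feats = dist_g.ndata["feat"][input_nodes]    # pulls remote features as needed
    batch_labels = dist_g.ndata["label"][output_nodes]
    loss = torch.nn.functional.cross_entropy(model(blocks, batch_feats), batch_labels)
    opt.zero_grad()
    loss.backward()   # synchronous gradient all-reduce via DDP
    opt.step()
```

The sketch reflects the design described above: partitioning with balancing constraints happens once offline, each trainer consumes mostly local nodes, and remote (halo) node features are fetched on demand during sampling and feature lookup.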
Tasks

Task                     Papers   Share
Graph Partitioning       3        42.86%
Recommendation Systems   1        14.29%
Graph Embedding          1        14.29%
Link Prediction          1        14.29%
Fraud Detection          1        14.29%