What is the current recommended approach for data parallelism? #9508
-
About 3-4 years ago I used an older data-parallel approach, and I'm wondering what the current recommendation is. I don't have a single very large graph that won't fit in memory; rather, I'm looking to obtain graph-level embeddings, so memory only becomes an issue when I batch my data, and I want to try a larger batch size by splitting the batch across multiple GPUs. FWIW, I will then be applying a contrastive loss to the graphs; I'm not sure if this makes a difference for what the recommended approach would be. Thanks for any help!
-
I'd simply use `DistributedDataParallel` for this use case. An example is here: https://github.com/pyg-team/pytorch_geometric/blob/fbafbc4fc9181e8759ec1f39d9618992793b5fe1/examples/multi_gpu/distributed_batching.py
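
For reference, here is a minimal sketch of that pattern, loosely following the linked `distributed_batching.py` example: each process owns one GPU, a `DistributedSampler` shards the dataset of small graphs across ranks, and DDP all-reduces gradients on the backward pass. The dataset (`TUDataset`/`PROTEINS`), the two-layer GCN encoder, and the placeholder loss are illustrative assumptions only — swap in your own data and contrastive objective over the graph embeddings.

```python
# Sketch: multi-GPU batching of many small graphs with DistributedDataParallel.
# Dataset, model size, and the placeholder loss are illustrative assumptions.
import os

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel
from torch.utils.data.distributed import DistributedSampler

from torch_geometric.datasets import TUDataset
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv, global_mean_pool


class GraphEncoder(torch.nn.Module):
    """Two-layer GCN that pools node features into one embedding per graph."""

    def __init__(self, in_dim: int, hidden_dim: int = 64, out_dim: int = 64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.lin = torch.nn.Linear(hidden_dim, out_dim)

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index).relu()
        x = global_mean_pool(x, batch)  # [num_graphs_in_batch, hidden_dim]
        return self.lin(x)


def run(rank: int, world_size: int):
    os.environ.setdefault('MASTER_ADDR', 'localhost')
    os.environ.setdefault('MASTER_PORT', '12355')
    dist.init_process_group('nccl', rank=rank, world_size=world_size)

    dataset = TUDataset(root='/tmp/PROTEINS', name='PROTEINS')

    # Each rank loads a disjoint shard of graphs, so the effective batch
    # size is batch_size * world_size.
    sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank)
    loader = DataLoader(dataset, batch_size=128, sampler=sampler)

    model = GraphEncoder(dataset.num_features).to(rank)
    model = DistributedDataParallel(model, device_ids=[rank])
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

    for epoch in range(10):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for data in loader:
            data = data.to(rank)
            optimizer.zero_grad()
            emb = model(data.x, data.edge_index, data.batch)
            # Placeholder objective so the sketch runs end-to-end;
            # replace with your contrastive loss over `emb`.
            loss = emb.pow(2).mean()
            loss.backward()  # DDP all-reduces gradients across ranks here
            optimizer.step()

    dist.destroy_process_group()


if __name__ == '__main__':
    world_size = torch.cuda.device_count()
    mp.spawn(run, args=(world_size,), nprocs=world_size, join=True)
```

One thing to keep in mind for a contrastive loss: by default each rank only sees its own shard of the batch, so in-batch negatives are limited to the per-rank batch. If you need negatives from the full global batch, you would gather embeddings across ranks (e.g. with `torch.distributed.all_gather`) before computing the loss.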