Commit

Merge pull request #473 from ttakayuki/202405-update-sharp-plugins
Updated the SHARP contents.
ttakayuki authored May 17, 2024
2 parents f5d1d8d + 05e8f29 commit 44b66ed
Showing 2 changed files with 21 additions and 11 deletions.
17 changes: 11 additions & 6 deletions en/docs/tips/sharp.md
@@ -18,13 +18,14 @@ The corresponding module of the plugin changes depending on the version of NCCL.

| NCCL-SHARP plugin module | NCCL versions |
| ---------------------------------------- | -------------------- |
-| `nccl-rdma-sharp-plugins/v2.1.x-5f238fb` | 2.8、2.9、2.10、2.11 |
+| `nccl-rdma-sharp-plugins/v2.1.x-5f238fb` | 2.8、2.11 |
+| `nccl-rdma-sharp-plugins/v2.2.x-5e6ed3e` | 2.12 |
| `nccl-rdma-sharp-plugins/v2.5.x-4ccb98a` | 2.12、2.13、2.14、2.15、2.16、2.17、2.18、2.19 |

To use SHARP with NCCL, load the CUDA, NCCL and NCCL SHARP plugin modules and set the following environment variables:

```
-[username@es-a1 ~] module load cuda/11.0 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
+[username@es-a1 ~] module load cuda/11.2 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
```

* `NCCL_COLLNET_ENABLE=1`
@@ -34,13 +35,16 @@ To use SHARP with NCCL, load the CUDA, NCCL and NCCL SHARP plugin modules and se

### Example using nccl-tests

-The following is an example of enabling SHARP on NCCL using [nccl-tests](https://github.com/NVIDIA/nccl-tests)).
+The following is an example of enabling SHARP on NCCL using [nccl-tests](https://github.com/NVIDIA/nccl-tests).

+!!! warning
+    We have confirmed an issue in the `nccl-rdma-sharp-plugins/v2.5.x-4ccb98a` module where nccl-tests do not run with NCCL 2.12 through 2.16.
+
First, download nccl-tests, enable MPI support, and then build.

```
-[username@es-a1 ~] module load openmpi/4.1.3 cuda/11.0 nccl/2.8
-[username@es-a1 ~] git clone https://github.com/NVIDIA/nccl-tests.git
+[username@es-a1 ~] module load hpcx/2.12 cuda/11.2 nccl/2.8
+[username@es-a1 ~] git clone https://github.com/NVIDIA/nccl-tests.git -b v2.11.0
[username@es-a1 ~] cd nccl-tests
[username@es-a1 ~] make MPI=1 MPI_HOME=${OMPI_HOME} CUDA_HOME=${CUDA_HOME} NCCL_HOME=${NCCL_HOME}
```
@@ -49,9 +53,10 @@ After building, a binary will be generated under the `build` directory, so execu

```
[username@es-a1 ~] qrsh -g group -l rt_AF=2 -l h_rt=01:00:00
-[username@a0000 ~] module load openmpi/4.1.3 cuda/11.0 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
+[username@a0000 ~] module load hpcx/2.12 cuda/11.2 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
[username@a0000 ~] cd nccl-tests
[username@a0000 ~] mpirun -np 16 -map-by ppr:8:node \
+    -hostfile ${SGE_JOB_HOSTLIST} \
-x UCX_TLS=dc,shm,self \
-x LD_LIBRARY_PATH=${LD_LIBRARY_PATH} \
-x NCCL_COLLNET_ENABLE=1 \
15 changes: 10 additions & 5 deletions ja/docs/tips/sharp.md
@@ -17,13 +17,14 @@ ABCIでは計算ノード(A)向けにNCCL-SHARPプラグインをモジュール

| NCCL SHARP plugin module                 | NCCL versions        |
| ---------------------------------------- | -------------------- |
-| `nccl-rdma-sharp-plugins/v2.1.x-5f238fb` | 2.8、2.9、2.10、2.11 |
+| `nccl-rdma-sharp-plugins/v2.1.x-5f238fb` | 2.8、2.11 |
+| `nccl-rdma-sharp-plugins/v2.2.x-5e6ed3e` | 2.12 |
| `nccl-rdma-sharp-plugins/v2.5.x-4ccb98a` | 2.12、2.13、2.14、2.15、2.16、2.17、2.18、2.19 |

To use SHARP with NCCL, load the CUDA, NCCL, and NCCL SHARP plugin modules and set the following environment variables.

```
-[username@es-a1 ~] module load cuda/11.0 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
+[username@es-a1 ~] module load cuda/11.2 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
```

* `NCCL_COLLNET_ENABLE=1`
@@ -35,11 +36,14 @@ NCCLでSHARPを使用するには、CUDA、NCCL、およびNCCL SHARPプラグ

The following is an example of enabling SHARP in NCCL using [nccl-tests](https://github.com/NVIDIA/nccl-tests).

+!!! warning
+    We have confirmed an issue in `nccl-rdma-sharp-plugins/v2.5.x-4ccb98a` where nccl-tests does not run with NCCL 2.12 through 2.16.
+
First, download nccl-tests, then build it with MPI support enabled.

```
-[username@es-a1 ~] module load openmpi/4.1.3 cuda/11.0 nccl/2.8
-[username@es-a1 ~] git clone https://github.com/NVIDIA/nccl-tests.git
+[username@es-a1 ~] module load hpcx/2.12 cuda/11.2 nccl/2.8
+[username@es-a1 ~] git clone https://github.com/NVIDIA/nccl-tests.git -b v2.11.0
[username@es-a1 ~] cd nccl-tests
[username@es-a1 ~] make MPI=1 MPI_HOME=${OMPI_HOME} CUDA_HOME=${CUDA_HOME} NCCL_HOME=${NCCL_HOME}
```
@@ -48,9 +52,10 @@ NCCLでSHARPを使用するには、CUDA、NCCL、およびNCCL SHARPプラグ

```
[username@es-a1 ~] qrsh -g group -l rt_AF=2 -l h_rt=01:00:00
-[username@a0000 ~] module load openmpi/4.1.3 cuda/11.0 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
+[username@a0000 ~] module load hpcx/2.12 cuda/11.2 nccl/2.8 nccl-rdma-sharp-plugins/v2.1.x-5f238fb
[username@a0000 ~] cd nccl-tests
[username@a0000 ~] mpirun -np 16 -map-by ppr:8:node \
+    -hostfile ${SGE_JOB_HOSTLIST} \
-x UCX_TLS=dc,shm,self \
-x LD_LIBRARY_PATH=${LD_LIBRARY_PATH} \
-x NCCL_COLLNET_ENABLE=1 \
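
Review note: the updated table in both files can be read as a lookup from NCCL version to plugin module. The following is a minimal shell sketch of that lookup, not something shipped with ABCI. The function name `sharp_plugin_for_nccl` is hypothetical. For NCCL 2.12, where the table lists two modules, the sketch picks `v2.2.x-5e6ed3e`; that choice is an assumption based on the warning about nccl-tests with `v2.5.x-4ccb98a`, not a recommendation stated in the documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of ABCI): print the NCCL SHARP plugin module
# listed for a given NCCL major.minor version in the updated table.
sharp_plugin_for_nccl() {
    case "$1" in
        2.8|2.11)
            echo "nccl-rdma-sharp-plugins/v2.1.x-5f238fb" ;;
        2.12)
            # Assumption: prefer v2.2.x here, because the warning reports that
            # nccl-tests does not run with v2.5.x on NCCL 2.12 through 2.16.
            echo "nccl-rdma-sharp-plugins/v2.2.x-5e6ed3e" ;;
        2.13|2.14|2.15|2.16|2.17|2.18|2.19)
            echo "nccl-rdma-sharp-plugins/v2.5.x-4ccb98a" ;;
        *)
            # Versions not in the table (for example 2.9 and 2.10 after this
            # commit) are reported as unsupported.
            echo "no SHARP plugin module listed for NCCL $1" >&2
            return 1 ;;
    esac
}
```

With this in place, the load step from the docs could be written as `module load cuda/11.2 nccl/2.8 "$(sharp_plugin_for_nccl 2.8)"`.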
