Commit

Merge pull request #485 from ttakayuki/202407-nvidia-update
2024-07 NVIDIA Library Update
ttakayuki authored Aug 23, 2024
2 parents 1102ada + 0f6cbeb commit f41a2f8
Showing 6 changed files with 42 additions and 6 deletions.
5 changes: 5 additions & 0 deletions en/docs/gpu.md
@@ -31,6 +31,7 @@ The following is a list of CUDA Toolkit, cuDNN, and NCCL that can be used with t
| cuda/12.4 | 12.4.0 | Yes | Yes | Yes |
| cuda/12.4 | 12.4.1 | Yes | Yes | Yes |
| cuda/12.5 | 12.5.0 | Yes | Yes | Yes |
| cuda/12.5 | 12.5.1 | Yes | Yes | Yes |

[^1]: Provided only for experimental use. Rocky Linux 8.6 is supported with CUDA 11.7.1 or later.

@@ -49,6 +50,7 @@ Compute Node (V):
| 8.9.7 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.0.0[^2] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.1.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.2.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

Compute Node (A):

@@ -63,6 +65,7 @@ Compute Node (A):
| 8.9.7 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.0.0[^2] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.1.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.2.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

[^2]: We have confirmed that when cuDNN 9.0.0 is used with CUDA 11.0 to CUDA 11.3, an error occurs when calling the `cudnnRNNBackwardWeights_v8` function.

@@ -84,6 +87,7 @@ Compute Node (V):
| 2.19.3-1 | - | - | - | - | - | Yes | Yes | - | - |
| 2.20.5-1 | - | - | - | - | - | Yes | - | Yes | - |
| 2.21.5-1 | - | - | - | - | - | Yes | - | Yes | Yes |
| 2.22.3-1 | - | - | - | - | - | Yes | - | Yes | Yes |

Compute Node (A):

@@ -101,6 +105,7 @@ Compute Node (A):
| 2.19.3-1 | - | - | - | - | - | Yes | Yes | - | - |
| 2.20.5-1 | - | - | - | - | - | Yes | - | Yes | - |
| 2.21.5-1 | - | - | - | - | - | Yes | - | Yes | Yes |
| 2.22.3-1 | - | - | - | - | - | Yes | - | Yes | Yes |

## GDRCopy

6 changes: 3 additions & 3 deletions en/docs/system-overview.md
@@ -159,7 +159,7 @@ The software available on the ABCI system is shown below.
| OS | Rocky Linux | 8.6 | - |
| OS | Red Hat Enterprise Linux | - | 8.2 |
| Job Scheduler | Altair Grid Engine | 8.6.19_C121_1 | 8.6.19_C121_1 |
| Development Environment | [CUDA Toolkit](gpu.md#cuda-toolkit) | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0 | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0 |
| Development Environment | [CUDA Toolkit](gpu.md#cuda-toolkit) | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0<br>12.5.1 | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0<br>12.5.1 |
| | Intel oneAPI<br>(compilers and libraries) | 2024.0.2 | 2024.0.2 |
| | Intel VTune | 2024.0.0 | 2024.0.0 |
| | Intel Trace Analyzer and Collector | 2022.0 | 2022.0 |
@@ -181,8 +181,8 @@ The software available on the ABCI system is shown below.
| Container | [SingularityPRO](containers.md#singularity) | 4.1.2-2 | 4.1.2-2 |
| | Singularity Endpoint | 2.3.0 | 2.3.0 |
| MPI | [Intel MPI](mpi.md#intel-mpi) | 2021.11 | 2021.11 |
| Library | [cuDNN](gpu.md#cudnn) | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1 | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1 |
| | [NCCL](gpu.md#nccl) | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1 | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1 |
| Library | [cuDNN](gpu.md#cudnn) | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1<br>9.2.1 | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1<br>9.2.1 |
| | [NCCL](gpu.md#nccl) | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1<br>2.22.3-1 | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1<br>2.22.3-1 |
| | gdrcopy | 2.4.1 | 2.4.1 |
| | UCX | 1.10 | 1.11 |
| | libfabric | 1.7.0-1 | 1.9.0rc1-1 |
13 changes: 13 additions & 0 deletions en/docs/system-updates.md
@@ -8,6 +8,19 @@
| Update | openjdk | 11.0.24.0.8 | 11.0.22.0.7 |
| Update | openjdk | 17.0.12.0.7 | 17.0.10.0.7 |

## 2024-08-08

| Add / Update / Delete | Software | Version | Previous version |
|:--|:--|:--|:--|
| Add | nccl | 2.22.3-1 | |

## 2024-07-31

| Add / Update / Delete | Software | Version | Previous version |
|:--|:--|:--|:--|
| Add | cuda | 12.5.1 | |
| Add | cudnn | 9.2.1 | |

## 2024-06-28

* The specific group area (/projects) is no longer available.
5 changes: 5 additions & 0 deletions ja/docs/gpu.md
@@ -31,6 +31,7 @@ On the ABCI system, the following libraries provided by NVIDIA are avail
| cuda/12.4 | 12.4.0 | Yes | Yes | Yes |
| cuda/12.4 | 12.4.1 | Yes | Yes | Yes |
| cuda/12.5 | 12.5.0 | Yes | Yes | Yes |
| cuda/12.5 | 12.5.1 | Yes | Yes | Yes |

[^1]: Provided for experimental use. Rocky Linux 8.6 is supported with CUDA 11.7.1 or later.

@@ -49,6 +50,7 @@ On the ABCI system, the following libraries provided by NVIDIA are avail
| 8.9.7 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.0.0[^2] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.1.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.2.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

Compute Node (A):

@@ -63,6 +65,7 @@ On the ABCI system, the following libraries provided by NVIDIA are avail
| 8.9.7 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.0.0[^2] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.1.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 9.2.1 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

[^2]: We have confirmed that when cuDNN 9.0.0 is used with CUDA 11.0 to 11.3, an error occurs when calling the `cudnnRNNBackwardWeights_v8` function.

@@ -84,6 +87,7 @@ On the ABCI system, the following libraries provided by NVIDIA are avail
| 2.19.3-1 | - | - | - | - | - | Yes | Yes | - | - |
| 2.20.5-1 | - | - | - | - | - | Yes | - | Yes | - |
| 2.21.5-1 | - | - | - | - | - | Yes | - | Yes | Yes |
| 2.22.3-1 | - | - | - | - | - | Yes | - | Yes | Yes |

Compute Node (A):

@@ -101,6 +105,7 @@ On the ABCI system, the following libraries provided by NVIDIA are avail
| 2.19.3-1 | - | - | - | - | - | Yes | Yes | - | - |
| 2.20.5-1 | - | - | - | - | - | Yes | - | Yes | - |
| 2.21.5-1 | - | - | - | - | - | Yes | - | Yes | Yes |
| 2.22.3-1 | - | - | - | - | - | Yes | - | Yes | Yes |

## GDRCopy

6 changes: 3 additions & 3 deletions ja/docs/system-overview.md
@@ -157,7 +157,7 @@ The list of software available on the ABCI system is shown below
| OS | Rocky Linux | 8.6 | - |
| OS | Red Hat Enterprise Linux | - | 8.2 |
| Job Scheduler | Altair Grid Engine | 8.6.19_C121_1 | 8.6.19_C121_1 |
| Development Environment | [CUDA Toolkit](gpu.md#cuda-toolkit) | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0 | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0 |
| Development Environment | [CUDA Toolkit](gpu.md#cuda-toolkit) | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0<br>12.5.1 | 11.2.2<br>11.6.2<br>11.7.1<br>11.8.0<br>12.1.1<br>12.2.0<br>12.3.2<br>12.4.0<br>12.4.1<br>12.5.0<br>12.5.1 |
| | Intel oneAPI<br>(compilers and libraries) | 2024.0.2 | 2024.0.2 |
| | Intel VTune | 2024.0.0 | 2024.0.0 |
| | Intel Trace Analyzer and Collector | 2022.0 | 2022.0 |
@@ -179,8 +179,8 @@ The list of software available on the ABCI system is shown below
| Container | [SingularityPRO](containers.md#singularity) | 4.1.2-2 | 4.1.2-2 |
| | Singularity Endpoint | 2.3.0 | 2.3.0 |
| MPI | [Intel MPI](mpi.md#intel-mpi) | 2021.11 | 2021.11 |
| Library | [cuDNN](gpu.md#cudnn) | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1 | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1 |
| | [NCCL](gpu.md#nccl) | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1 | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1 |
| Library | [cuDNN](gpu.md#cudnn) | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1<br>9.2.1 | 8.1.1<br>8.3.3<br>8.4.1<br>8.6.0<br>8.7.0<br>8.8.1<br>8.9.7<br>9.0.0<br>9.1.1<br>9.2.1 |
| | [NCCL](gpu.md#nccl) | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1<br>2.22.3-1 | 2.8.4-1<br>2.11.4-1<br>2.12.12-1<br>2.13.4-1<br>2.14.3-1<br>2.15.5-1<br>2.16.2-1<br>2.17.1-1<br>2.18.5-1<br>2.19.3-1<br>2.20.5-1<br>2.21.5-1<br>2.22.3-1 |
| | gdrcopy | 2.4.1 | 2.4.1 |
| | UCX | 1.10 | 1.11 |
| | libfabric | 1.7.0-1 | 1.9.0rc1-1 |
13 changes: 13 additions & 0 deletions ja/docs/system-updates.md
@@ -8,6 +8,19 @@
| Update | openjdk | 11.0.24.0.8 | 11.0.22.0.7 |
| Update | openjdk | 17.0.12.0.7 | 17.0.10.0.7 |

## 2024-08-08

| Add / Update / Delete | Software | Version | Previous version |
|:--|:--|:--|:--|
| Add | nccl | 2.22.3-1 | |

## 2024-07-31

| Add / Update / Delete | Software | Version | Previous version |
|:--|:--|:--|:--|
| Add | cuda | 12.5.1 | |
| Add | cudnn | 9.2.1 | |

## 2024-06-28

* Service for the specific group area (/projects) has been discontinued.