Fix compiler error due to unavailability of some CUDA methods #270

Merged: 1 commit, Jul 16, 2024

sebhtml (Contributor) commented on Jul 7, 2024

Problem

cuMemAdvise_v2 and cuMemPrefetchAsync_v2 were both added in CUDA toolkit 12.2, but cudarc tries to use them with every CUDA toolkit version, so the build fails with a compiler error on older toolkits.

GitHub issue

#269

Analysis

I could not find in the NVIDIA documentation which CUDA toolkit version added cuMemAdvise_v2 and cuMemPrefetchAsync_v2.

The information is available on the AMD website: https://rocm.docs.amd.com/projects/HIPIFY/en/latest/tables/CUDA_Driver_API_functions_supported_by_HIP.html

According to that page, both functions were added in CUDA toolkit 12.2.

Solution

Use these functions only with CUDA toolkit 12.2 or later.
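
The version rule above can be sketched in plain Rust. This is only an illustration, not the code changed by this PR. The helper names (cuda_version, has_v2_mem_functions, prefetch_symbol) are hypothetical. The version encoding (major * 1000 + minor * 10) is an assumption inferred from the cuda-12000 feature name used in the test run below. Falling back to the non-_v2 names on older toolkits is also an assumption. In the crate itself the same condition would more likely be applied at compile time with feature-based cfg attributes.

```rust
/// Encodes a CUDA toolkit version as major * 1000 + minor * 10,
/// e.g. 12.0 -> 12000 and 12.2 -> 12020 (assumed encoding).
const fn cuda_version(major: u32, minor: u32) -> u32 {
    major * 1000 + minor * 10
}

/// First toolkit version that provides the `_v2` functions,
/// according to the HIPIFY table cited in the analysis.
const V2_MEM_FUNCTIONS_SINCE: u32 = cuda_version(12, 2);

/// Hypothetical helper: true when the `_v2` functions may be used.
const fn has_v2_mem_functions(toolkit: u32) -> bool {
    toolkit >= V2_MEM_FUNCTIONS_SINCE
}

/// Hypothetical helper: picks the driver function name for a prefetch.
/// The fallback to the older name is an assumption for illustration.
fn prefetch_symbol(toolkit: u32) -> &'static str {
    if has_v2_mem_functions(toolkit) {
        "cuMemPrefetchAsync_v2"
    } else {
        "cuMemPrefetchAsync"
    }
}

fn main() {
    // Print which function name each toolkit version would select.
    for (major, minor) in [(11, 8), (12, 0), (12, 2)] {
        let version = cuda_version(major, minor);
        println!("CUDA {major}.{minor} ({version}): {}", prefetch_symbol(version));
    }
}
```

With the CUDA 12.0 toolkit reported by nvcc below, this check selects the older function name, which is the case that previously failed to compile.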

NVIDIA GPU model

sebhtml@legion:~/projects/cudarc$ nvidia-smi
Sat Jul  6 21:40:58 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4060 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   40C    P3              N/A /  60W |      8MiB /  8188MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      3015      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

CUDA toolkit version

sebhtml@legion:~/projects/cudarc$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0

Linux kernel version

sebhtml@legion:~/projects/cudarc$ uname -a
Linux legion 6.8.0-36-generic #36-Ubuntu SMP PREEMPT_DYNAMIC Mon Jun 10 10:49:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Operating System version

sebhtml@legion:~/projects/cudarc$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 24.04 LTS
Release:	24.04
Codename:	noble

cudarc tests

sebhtml@legion:~/projects/cudarc$ cargo test --release --features cuda-12000
    Finished `release` profile [optimized] target(s) in 0.02s
     Running unittests src/lib.rs (target/release/deps/cudarc-93da303d10d99a7b)

running 158 tests
test cublas::sys::sys_12000::bindgen_test_layout_double2 ... ok
test cublas::sys::sys_12000::bindgen_test_layout_float2 ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout__IO_FILE ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout_cublasLtMatmulAlgo_t ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout_cublasLtMatrixTransformDescOpaque_t ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout_cublasLtMatmulDescOpaque_t ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout_cublasLtMatmulHeuristicResult_t ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout_cublasLtMatmulPreferenceOpaque_t ... ok
test cublaslt::sys::sys_12000::bindgen_test_layout_cublasLtMatrixLayoutOpaque_t ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnAlgorithmUnionStruct ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnAlgorithmUnionStruct_Algorithm ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnConvolutionBwdDataAlgoPerfStruct ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnConvolutionBwdFilterAlgoPerfStruct ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnConvolutionFwdAlgoPerfStruct ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnDebugStruct ... ok
test cudnn::sys::sys_12000::bindgen_test_layout_cudnnFractionStruct ... ok
test curand::safe::tests::test_rc_counts ... ok
test curand::safe::tests::test_normal_f64 ... ok
test curand::safe::tests::test_log_normal_f64 ... ok
test curand::safe::tests::test_log_normal_f32 ... ok
test curand::safe::tests::test_different_seeds_neq ... ok
test curand::safe::tests::test_seed_reproducible ... ok
test curand::safe::tests::test_normal_f32 ... ok
test cublaslt::safe::tests::test_matmul_f32 ... ok
test cublas::safe::tests::test_dgemm ... ok
test driver::safe::alloc::tests::test_copy_uses_correct_context ... ok
test driver::safe::alloc::tests::test_post_alloc_memory ... ignored, must be executed by itself
test driver::safe::alloc::tests::test_leak_and_upgrade ... ok
test driver::safe::alloc::tests::test_post_build_arc_count ... ok
test driver::safe::alloc::tests::test_post_clone_counts ... ok
test driver::safe::alloc::tests::test_post_clone_arc_slice_counts ... ok
test cudnn::safe::tests::test_create_descriptors ... ok
test driver::safe::alloc::tests::test_slice_is_freed_with_correct_context ... ok
test cublas::safe::tests::test_sgemm ... ok
test driver::safe::alloc::tests::test_post_alloc_arc_counts ... ok
test driver::safe::core::tests::test_bounds_helper ... ok
test driver::safe::core::tests::test_transmutes ... ok
test driver::safe::alloc::tests::test_post_take_arc_counts ... ok
test driver::safe::alloc::tests::test_device_copy_to_views ... ok
test driver::safe::alloc::tests::test_post_release_counts ... ok
test cublas::safe::tests::test_sgemv ... ok
test curand::safe::tests::test_uniform_f64 ... ok
test curand::safe::tests::test_uniform_f32 ... ok
test curand::safe::tests::test_set_offset ... ok
test cublas::safe::tests::test_dgemv ... ok
test curand::safe::tests::test_uniform_u32 ... ok
test driver::safe::launch::tests::test_mut_into_kernel_param_no_inc_rc ... ok
test driver::safe::launch::tests::test_ref_into_kernel_param_inc_rc ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_ARRAY3D_DESCRIPTOR_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_ARRAY_DESCRIPTOR_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_ARRAY_SPARSE_PROPERTIES_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_ARRAY_MEMORY_REQUIREMENTS_st ... ok
test driver::safe::threading::tests::test_threading ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_ARRAY_SPARSE_PROPERTIES_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_BATCH_MEM_OP_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_MEMORY_BUFFER_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_MEMORY_HANDLE_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_MEMORY_HANDLE_DESC_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_MEMORY_HANDLE_DESC_st__bindgen_ty_1__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_MEMORY_MIPMAPPED_ARRAY_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_HANDLE_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_HANDLE_DESC_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_HANDLE_DESC_st__bindgen_ty_1__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_SIGNAL_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_SIGNAL_PARAMS_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_SIGNAL_PARAMS_st__bindgen_ty_1__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_SIGNAL_PARAMS_st__bindgen_ty_1__bindgen_ty_2 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_SIGNAL_PARAMS_st__bindgen_ty_1__bindgen_ty_3 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_WAIT_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_WAIT_PARAMS_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_WAIT_PARAMS_st__bindgen_ty_1__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_WAIT_PARAMS_st__bindgen_ty_1__bindgen_ty_2 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXTERNAL_SEMAPHORE_WAIT_PARAMS_st__bindgen_ty_1__bindgen_ty_3 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXT_SEM_SIGNAL_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_GRAPH_INSTANTIATE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_EXT_SEM_WAIT_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_HOST_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_KERNEL_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_KERNEL_NODE_PARAMS_v2_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_LAUNCH_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_MEMCPY2D_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_MEMCPY3D_PEER_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_MEMCPY3D_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_MEMSET_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_MEM_ALLOC_NODE_PARAMS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_POINTER_ATTRIBUTE_P2P_TOKENS_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st__bindgen_ty_1__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st__bindgen_ty_1__bindgen_ty_2 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st__bindgen_ty_1__bindgen_ty_3 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st__bindgen_ty_1__bindgen_ty_4 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_DESC_st__bindgen_ty_1__bindgen_ty_5 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_RESOURCE_VIEW_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUDA_TEXTURE_DESC_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUaccessPolicyWindow_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUarrayMapInfo_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUarrayMapInfo_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUarrayMapInfo_st__bindgen_ty_2 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUarrayMapInfo_st__bindgen_ty_2__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUarrayMapInfo_st__bindgen_ty_2__bindgen_ty_2 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUarrayMapInfo_st__bindgen_ty_3 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUdevprop_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUexecAffinityParam_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUexecAffinityParam_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUexecAffinitySmCount_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUgraphExecUpdateResultInfo_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUipcEventHandle_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUipcMemHandle_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlaunchAttributeValue_union ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlaunchAttributeValue_union__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlaunchAttributeValue_union__bindgen_ty_2 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlaunchAttribute_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlaunchConfig_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlaunchMemSyncDomainMap_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUlibraryHostUniversalFunctionAndDataTable_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUmemAccessDesc_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUmemAllocationProp_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUmemAllocationProp_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUmemLocation_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUmemPoolProps_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUmemPoolPtrExportData_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union_CUstreamMemOpFlushRemoteWritesParams_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union_CUstreamMemOpMemoryBarrierParams_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union_CUstreamMemOpWaitValueParams_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union_CUstreamMemOpWaitValueParams_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union_CUstreamMemOpWriteValueParams_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUstreamBatchMemOpParams_union_CUstreamMemOpWriteValueParams_st__bindgen_ty_1 ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUtensorMap_st ... ok
test driver::sys::sys_12000::bindgen_test_layout_CUuuid_st ... ok
test cudnn::safe::tests::test_reduction ... ok
test nccl::sys::sys_12000::bindgen_test_layout_ncclConfig_v21400 ... ok
test nccl::sys::sys_12000::bindgen_test_layout_ncclUniqueId ... ok
test driver::safe::launch::tests::test_launch_with_32bit ... ok
test cudnn::safe::tests::test_conv1d ... ok
test cudnn::safe::tests::test_conv2d_pick_algorithms ... FAILED
test driver::safe::launch::tests::test_launch_with_16bit ... ok
test driver::safe::launch::tests::test_large_launches ... ok
test driver::safe::launch::tests::test_launch_with_64bit ... ok
test nvrtc::safe::tests::test_compile_options_build_ftz ... ok
test nvrtc::safe::tests::test_compile_options_build_multi ... ok
test nvrtc::safe::tests::test_compile_options_build_none ... ok
test driver::safe::launch::tests::test_launch_with_8bit ... ok
test driver::safe::launch::tests::test_launch_with_floats ... ok
test driver::safe::launch::tests::test_launch_with_mut_and_ref_cudarc ... ok
test driver::safe::launch::tests::test_launch_with_views ... ok
test nvrtc::result::tests::test_compile_bad_program ... ok
test nvrtc::result::tests::test_compile_program_1_opt ... ok
test nvrtc::result::tests::test_compile_program_2_opt ... ok
test nvrtc::result::tests::test_compile_program_no_opts ... ok
test nvrtc::result::tests::test_get_ptx ... ok
test nvrtc::safe::tests::test_compile_no_opts ... ok
test driver::safe::launch::tests::test_par_launch ... ok
test cudnn::safe::tests::test_conv3d ... ok
test nccl::result::tests::multi_thread ... ok
test nccl::result::tests::single_thread ... ok
test nccl::safe::tests::test_all_reduce ... ok

failures:

---- cudnn::safe::tests::test_conv2d_pick_algorithms stdout ----
thread 'cudnn::safe::tests::test_conv2d_pick_algorithms' panicked at src/cudnn/safe/mod.rs:113:13:
assertion `left == right` failed
  left: CUDNN_CONVOLUTION_BWD_DATA_ALGO_WINOGRAD
 right: CUDNN_CONVOLUTION_BWD_DATA_ALGO_1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    cudnn::safe::tests::test_conv2d_pick_algorithms

test result: FAILED. 156 passed; 1 failed; 1 ignored; 0 measured; 0 filtered out; finished in 3.15s

error: test failed, to rerun pass `--lib`

coreylowman (Owner) left a comment

ty!
