This repository has been archived by the owner on Apr 28, 2023. It is now read-only.

[Cuda Codegen] Emit launch bounds #526

Open
wants to merge 1 commit into
base: master

thetheodor

CUDA functions can be annotated with launch bounds: the maximum number
of threads per block and, optionally, the minimum number of blocks per
multiprocessor. nvrtc/nvcc use this information during register
allocation (and probably in other phases as well).
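
As a rough illustration of what emitting launch bounds could look like on the codegen side, here is a minimal C++ sketch that builds the annotated kernel prefix as a string. The helper name `emitKernelPrefix` and its parameters are hypothetical and are not taken from this patch; only the `__launch_bounds__(maxThreads, minBlocks)` qualifier itself is standard CUDA.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not this patch's actual codegen API): builds the
// prefix of a kernel signature annotated with __launch_bounds__.
// minBlocksPerSM == 0 means "do not emit the optional second argument".
std::string emitKernelPrefix(
    const std::string& name,
    std::size_t maxThreadsPerBlock,
    std::size_t minBlocksPerSM = 0) {
  std::ostringstream ss;
  ss << "__global__ void __launch_bounds__(" << maxThreadsPerBlock;
  if (minBlocksPerSM > 0) {
    ss << ", " << minBlocksPerSM;
  }
  ss << ") " << name;
  return ss.str();
}
```

Calling `emitKernelPrefix("kernel", 256)` under these assumptions yields `__global__ void __launch_bounds__(256) kernel`, to which the parameter list and body would then be appended.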

@ftynse
Contributor

ftynse commented Jun 19, 2018

Did you check what happens if somebody manually maps to .mapToThreads(32,0,0)?

@thetheodor
Author

Fixed.

Contributor

@skimo-openhub skimo-openhub left a comment

I'm just putting a block on this because it fails for me with:

[ RUN      ] TensorDot_32_512_8_2_28_28.BaseCorrect
unknown file: Failure
C++ exception with description "Error at: /home/skimo/git/c2isl/tc/core/cuda/cuda_rtc.cc:188: CUDA_ERROR_INVALID_VALUE" thrown in the test body.
[  FAILED  ] TensorDot_32_512_8_2_28_28.BaseCorrect (540 ms)

@skimo-openhub
Contributor

skimo-openhub commented Jun 20, 2018 via email

@thetheodor
Author

Oh, this test is failing for me as well. However, if I dump the generated CUDA and compile it with nvcc, I see no error.

auto b1 = block.view[1];
b1 = b1 == 0 ? 1 : b1;
auto b2 = block.view[2];
b1 = b2 == 0 ? 1 : b2;
Contributor

This should be b2 instead of b1.
However, I would suggest you remove this special handling of 0.
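
To make the first point concrete, here is a small self-contained C++ sketch of the clamping with the assignment going to `b2`. It assumes the launch bound is the product of the three block sizes and that the sizes are unsigned integers; `BlockSizes`, `clampZero` and `maxThreadsPerBlock` are hypothetical names standing in for whatever the patch actually uses around `block.view`.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the three block sizes (block.view in the patch).
using BlockSizes = std::array<std::size_t, 3>;

// Treat a zero extent as 1, mirroring the special case in the quoted code.
inline std::size_t clampZero(std::size_t v) {
  return v == 0 ? 1 : v;
}

// Assumed definition of the bound: the product of the block sizes, with
// zero extents in dimensions 1 and 2 clamped to 1.
std::size_t maxThreadsPerBlock(const BlockSizes& block) {
  const std::size_t b0 = block[0];
  const std::size_t b1 = clampZero(block[1]);
  // The quoted code assigned this result to b1, leaving b2 unclamped.
  const std::size_t b2 = clampZero(block[2]);
  return b0 * b1 * b2;
}
```

With this version a mapping such as (32, 0, 0) gives a bound of 32. With the quoted code, `b2` would stay 0 and the same product would collapse to 0. This sketch only illustrates the typo; it does not address the second suggestion of dropping the special case altogether.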

Contributor

@nicolasvasilache nicolasvasilache left a comment

trying to unsubscribe, don't see a way other than approving


5 participants