Possibility of using RDNA3 Matrix Multiplication Instructions to Speed up on AMD GPU #566

Open
fancyIX opened this issue Dec 17, 2024 · 3 comments

@fancyIX

fancyIX commented Dec 17, 2024

AMD's RDNA 3 architecture introduces matrix cores with specialized instructions for accelerating matrix multiplication.

Details here: https://gpuopen.com/learn/wmma_on_rdna3/

@fancyIX
Author

fancyIX commented Dec 17, 2024

Wave Matrix Multiply-Accumulate (WMMA) instructions provide acceleration for common matrix arithmetic operations. The instructions are encoded using the VOP3P encoding.

@fancyIX
Author

fancyIX commented Dec 18, 2024

I tried the clang that ships with AMD's official driver. It seems to be able to compile OpenCL code that calls __builtin_amdgcn_wmma_f32_16x16x16_f16_w32 and similar builtins.
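
For anyone who wants to validate a WMMA-based kernel against something simple, here is a minimal scalar sketch in plain C of the arithmetic that the 16x16x16 variant is understood to perform: D = A * B + C on 16x16 tiles, with f16 inputs and f32 accumulation. The function name wmma_reference_16x16x16 and the WMMA_TILE macro are hypothetical and made up for illustration. The row-major layout is an assumption, float stands in for the f16 inputs for portability, and nothing here models how the hardware spreads the tile across the lanes of a wave.

```c
#include <stddef.h>

/* Tile edge implied by the 16x16x16 in the builtin name (hypothetical macro). */
#define WMMA_TILE 16

/* Scalar reference for one tile: d = a * b + c.
 * All four tiles are 16x16 and assumed row-major.
 * a and b stand in for the f16 operands (float is used here for portability);
 * c and d are the f32 accumulator input and output. */
void wmma_reference_16x16x16(const float *a, const float *b,
                             const float *c, float *d)
{
    for (size_t row = 0; row < WMMA_TILE; ++row) {
        for (size_t col = 0; col < WMMA_TILE; ++col) {
            /* Start from the accumulator value, then add the dot product
             * of row `row` of a with column `col` of b. */
            float acc = c[row * WMMA_TILE + col];
            for (size_t k = 0; k < WMMA_TILE; ++k) {
                acc += a[row * WMMA_TILE + k] * b[k * WMMA_TILE + col];
            }
            d[row * WMMA_TILE + col] = acc;
        }
    }
}
```

A GPU result for the same tile could be compared element by element against d, with a tolerance that accounts for the f16 rounding of the inputs.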

@CNugteren
Owner

Sounds interesting. I (the maintainer of this project) am not able to work on this myself, but I'm happy to point people who are willing to work on it in the right direction, and to review and accept pull requests.
