Wave Matrix Multiply-Accumulate (WMMA) instructions provide acceleration for common matrix arithmetic
operations. The instructions are encoded using the VOP3P encoding.
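To make the arithmetic concrete, here is a minimal CPU-side sketch of what a single 16x16x16 multiply-accumulate computes, D = A * B + C. It is only a reference for the math, not the hardware behaviour: the names `Tile`, `kTile`, and `wmma_reference` are hypothetical, plain `float` stands in for the f16 inputs, and the row-major layout is an assumption that says nothing about how the real instruction distributes fragments across lanes or rounds intermediate results.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Edge length of one tile for the 16x16x16 shape.
constexpr std::size_t kTile = 16;

// Row-major 16x16 tile. Assumption: float stands in for the f16 inputs.
using Tile = std::array<float, kTile * kTile>;

// Hypothetical CPU reference: returns D = A * B + C for one tile.
Tile wmma_reference(const Tile& a, const Tile& b, const Tile& c) {
    Tile d{};
    for (std::size_t i = 0; i < kTile; ++i) {
        for (std::size_t j = 0; j < kTile; ++j) {
            // Start from the accumulator value, then add the dot product
            // of row i of A with column j of B.
            float acc = c[i * kTile + j];
            for (std::size_t k = 0; k < kTile; ++k) {
                acc += a[i * kTile + k] * b[k * kTile + j];
            }
            d[i * kTile + j] = acc;
        }
    }
    return d;
}
```

A reference like this could be useful for checking the output of a kernel that uses the WMMA builtins, bearing in mind that results on real hardware may differ slightly because of f16 input precision.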
I tried the clang that ships with AMD's official driver. It seems to compile OpenCL code that calls __builtin_amdgcn_wmma_f32_16x16x16_f16_w32 and related builtins.
Sounds interesting. I (the maintainer of this project) am not able to work on this myself, but I'm happy to point people who are willing to work on it in the right direction, and to review and accept pull requests.
AMD's RDNA 3 architecture introduces matrix cores with specialized instructions for accelerating matrix multiplication.
Details here: https://gpuopen.com/learn/wmma_on_rdna3/