Reasoning about errors in Triton Kernels #1991
JordanLazzaro
This is a big problem that we are aware of. We have some very aggressive optimization passes that can break easily on very complex inner loops. This will disappear after an optimizer refactor that we plan to do in the next few months. Fortunately, for simpler programs that don't have multiple …
I've been attempting to write a from-scratch implementation of FlashAttention in Triton, and I find that I'm basically writing things line by line, guessing and checking what I need to change to stop the kernel from erroring or hanging. This has been somewhat demoralizing, as I don't know of a good systematic way to diagnose what is going wrong at any given point. Currently, I'm running kernels in a Jupyter notebook on an A10 GPU (the only one that's ever available on Lambda Cloud), so perhaps this is the source of some of my issues, since I know a lot of things are tested on A100s. But as a whole, what tools and techniques work well for diagnosing issues with kernels written in Triton more directly?
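One technique that does not depend on the GPU or the compiler is to keep a slow CPU reference for each stage of the kernel and compare against it. The sketch below is only an illustration, not code from this thread: it is plain Python rather than Triton, it handles a single query row, and every name in it (`attention_row_reference`, `attention_row_tiled`, `block_size`) is hypothetical. It shows the block-by-block running-max and running-sum update that a FlashAttention-style inner loop performs, next to the direct softmax it should agree with.

```python
import math


def _dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def attention_row_reference(q, keys, values):
    """Direct softmax(q . K^T / sqrt(d)) @ V for one query row."""
    sm_scale = 1.0 / math.sqrt(len(q))
    scores = [_dot(q, k) * sm_scale for k in keys]
    row_max = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - row_max) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[d] for w, v in zip(weights, values)) / total
        for d in range(dim)
    ]


def attention_row_tiled(q, keys, values, block_size):
    """Same quantity, computed one key/value block at a time.

    Keeps a running max, a running softmax denominator, and an
    unnormalized accumulator, rescaling them whenever the max grows.
    """
    sm_scale = 1.0 / math.sqrt(len(q))
    dim = len(values[0])
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * dim
    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        scores = [_dot(q, k) * sm_scale for k in k_blk]
        new_max = max(running_max, max(scores))
        # Rescale factor for everything accumulated so far.
        # On the first block this is exp(-inf) == 0.0.
        rescale = math.exp(running_max - new_max)
        weights = [math.exp(s - new_max) for s in scores]
        running_sum = running_sum * rescale + sum(weights)
        acc = [
            a * rescale + sum(w * v[d] for w, v in zip(weights, v_blk))
            for d, a in enumerate(acc)
        ]
        running_max = new_max
    return [a / running_sum for a in acc]


def max_abs_diff(xs, ys):
    """Largest elementwise absolute difference between two lists."""
    return max(abs(x - y) for x, y in zip(xs, ys))
```

With something like this in hand, intermediate values stored by the kernel (for example the running max after each block) can be checked against the corresponding Python variables, which narrows a wrong result down to a specific step instead of a whole kernel.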