Reasoning about errors in Triton Kernels #1991

JordanLazzaro · 2023-07-26T16:34:44Z


I've been attempting to write a from-scratch implementation of FlashAttention in Triton, and I find that I'm basically writing it line by line, guessing and checking what I need to change to stop things from erroring or hanging. This has been somewhat demoralizing, as I don't know of a good systematic way to diagnose what may be going wrong at any given point. Currently, I'm running kernels in a Jupyter notebook on an A10 GPU (the only one that's ever available on Lambda Cloud), so perhaps this is the source of some of my issues, since I know a lot of things are tested on A100s. But on the whole, what tools and techniques work well for diagnosing issues in Triton kernels more directly?

ptillet · 2023-07-26T16:57:59Z

Maintainer

This is a big problem we are aware of. We have some very aggressive optimization passes that can break easily on very complex inner loops. This will disappear after an optimizer refactor that we plan to do in the next few months.

Fortunately, simpler programs that don't have multiple dot operations in their inner loops should be more stable :)
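To make "multiple dots in an inner loop" concrete, here is a minimal sketch of the FlashAttention loop structure. It is plain Python rather than Triton, so it only illustrates where the two matrix products sit inside the loop over key/value blocks; it says nothing about how the Triton optimizer behaves. All function names (`matmul`, `naive_attention`, `blocked_attention`) are hypothetical helpers made up for this illustration. Checking a kernel's output against a slow reference like `naive_attention` is offered here as an assumption about a reasonable sanity check, not something described in this thread.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]


def naive_attention(q, k, v):
    """Reference: softmax(q @ k^T / sqrt(d)) @ v, one query row at a time."""
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    out = []
    for row in scores:
        scaled = [s * scale for s in row]
        m = max(scaled)
        p = [math.exp(s - m) for s in scaled]
        z = sum(p)
        out.append([x / z for x in matmul([p], v)[0]])
    return out


def blocked_attention(q, k, v, block=2):
    """Loop over key/value blocks, keeping a running (online) softmax."""
    n_q, d_v = len(q), len(v[0])
    scale = 1.0 / math.sqrt(len(q[0]))
    m = [float("-inf")] * n_q               # running row maxima
    l = [0.0] * n_q                         # running softmax denominators
    acc = [[0.0] * d_v for _ in range(n_q)]  # running unnormalized outputs
    for start in range(0, len(k), block):
        k_blk = k[start:start + block]
        v_blk = v[start:start + block]
        s = matmul(q, transpose(k_blk))     # first dot: Q @ K_blk^T
        p_blk, alphas = [], []
        for i in range(n_q):
            row = [x * scale for x in s[i]]
            m_new = max(m[i], max(row))
            alpha = math.exp(m[i] - m_new)  # rescales earlier blocks (0.0 on the first block)
            p = [math.exp(x - m_new) for x in row]
            l[i] = l[i] * alpha + sum(p)
            m[i] = m_new
            p_blk.append(p)
            alphas.append(alpha)
        pv = matmul(p_blk, v_blk)           # second dot: P @ V_blk
        for i in range(n_q):
            acc[i] = [a * alphas[i] + b for a, b in zip(acc[i], pv[i])]
    return [[a / l[i] for a in acc[i]] for i in range(n_q)]
```

The two lines marked "first dot" and "second dot" are the pair of matrix products that both live inside the same loop. A plain matmul kernel, by contrast, has only one such product in its inner loop.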

1 reply

JordanLazzaro Jul 26, 2023
Author

How did you end up figuring out what worked when implementing FlashAttention? Was there some workflow you found to work well with this issue in mind?
