TritonBench is a collection of PyTorch operators used to evaluate the performance of Triton and its integration with PyTorch.
The benchmark suite is designed to be self-contained, bundling its own dependencies. To install, follow the steps below.
Step 1: clone the repository and check out all submodules
$ git clone https://github.com/pytorch-labs/tritonbench.git
$ cd tritonbench
$ git submodule update --init --recursive
Step 2: run install.py
$ python install.py
By default, it will install the latest PyTorch nightly release and use the Triton version bundled with it.
To benchmark an operator, run the following command:
$ python run.py --op gemm
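Under the hood, an operator benchmark warms up the kernel and then reports a latency statistic over repeated runs. The sketch below illustrates that warmup-then-measure pattern with only the standard library; it is not the actual tritonbench harness, and the `bench` helper and pure-Python `matmul` stand-in are hypothetical names used for illustration only.

```python
import timeit

def bench(fn, warmup=5, rep=20):
    """Minimal sketch of the warmup-then-measure pattern used by
    operator benchmarks (NOT the actual tritonbench harness)."""
    for _ in range(warmup):   # warm caches / compile before timing
        fn()
    times = sorted(timeit.timeit(fn, number=1) for _ in range(rep))
    return times[len(times) // 2]  # median latency in seconds

# Stand-in workload: a pure-Python matrix multiply on nested lists,
# playing the role of the "gemm" operator above.
def matmul(a, b):
    n, k, m = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(m)]
            for i in range(n)]

a = [[1.0] * 32 for _ in range(32)]
b = [[2.0] * 32 for _ in range(32)]
latency = bench(lambda: matmul(a, b))
print(f"median latency: {latency:.6f} s")
```

The real harness additionally synchronizes the GPU around each timed run and can report throughput and memory metrics, but the control flow is the same.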
To install as a library:
$ pip install -e .
# in your own benchmark script
import tritonbench
from tritonbench.utils import parser

# parse the standard tritonbench command-line arguments
op_args = parser.parse_args()
# look up the "addmm" operator benchmark class and instantiate it
addmm_bench = tritonbench.load_opbench_by_name("addmm")(op_args)
addmm_bench.run()
We depend on the following projects as sources of customized Triton or CUTLASS kernels:
- (CUDA, HIP) kernels
- (CUDA, HIP) generative-recommenders
- (CUDA, HIP) Liger-Kernel
- (CUDA) xformers
- (CUDA) flash-attention
- (CUDA) FBGEMM
- (CUDA) ThunderKittens
- (CUDA) cutlass-kernels
TritonBench is BSD 3-Clause licensed, as found in the LICENSE file.