Hi,
When running 'make alltuners' on a Mali GPU, some tuners run for hours, and eventually the process hangs and never returns. Is there any way to speed this up?
First of all, you could modify the tuner's file, e.g. CLBlast/src/tuning/kernels/xgemm.hpp, and reduce the number of parameter values in settings.parameters in multiple places, e.g. change {16, 32, 64} into {16, 32}.
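To see why trimming value lists helps, here is a rough, standalone sketch of how the search space scales. It is not CLBlast code: the Parameter alias, the CountConfigurations helper and any parameter names used with it are made up for illustration. It assumes the tuner tries at most one configuration per combination of values, so the count is an upper bound that any constraints between parameters would lower.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One tunable parameter: a name plus the candidate values the tuner may try.
// (Hypothetical type for this sketch, not a CLBlast type.)
using Parameter = std::pair<std::string, std::vector<std::size_t>>;

// Upper bound on the number of configurations: the product of the number of
// candidate values of every parameter. Constraints between parameters, if
// any, can only make the real number smaller.
std::size_t CountConfigurations(const std::vector<Parameter>& parameters) {
  std::size_t count = 1;
  for (const auto& parameter : parameters) {
    count *= parameter.second.size();
  }
  return count;
}
```

For example, three parameters with three values each give 3 * 3 * 3 = 27 combinations; cutting two of them down to two values leaves 2 * 2 * 3 = 12. The saving multiplies across every list that is shortened, which is why editing several lists in settings.parameters can reduce the run time considerably.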
Secondly, you could change the --fraction command-line argument (of e.g. clblast_tuner_xgemm) to something below 1.0 so that not every configuration is tested.
Thirdly, you could tune only for the precision you need, e.g. single-precision floats (32) only, and skip the other tuners. Basically, make alltuners first compiles everything and then runs all the tuners (e.g. ./clblast_tuner_xgemm --precision 32) for all precisions, one after another.
Lastly, for GEMM specifically there are 4 parts being tuned (from CLBlast/src/tuning/kernels/xgemm.cpp):
printf("* (1/4) Tuning main GEMM kernel (GEMMK == 0) for fixed set of parameters\n\n");
StartVariation<1>(argc, argv);
printf("* (2/4) Tuning main GEMM kernel (GEMMK == 0) for random parameters out of larger set\n\n");
StartVariation<2>(argc, argv);
printf("* (3/4) Tuning secondary GEMM kernel (GEMMK == 1) for fixed set of parameters\n\n");
StartVariation<11>(argc, argv);
printf("* (4/4) Tuning secondary GEMM kernel (GEMMK == 1) for random parameters out of larger set\n\n");
StartVariation<12>(argc, argv);