You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
first I have to thank for sharing such a fine profiling library!
Currently I am tracking down some strange performance degradation issues in MFEM with OpenMP+MPI based builds. Since my experience with profiling MFEM through Caliper was good for now and I was able to track down issues in the past pretty quickly with this suite, I decided to use this setup again. However, I run into troubles when using OpenMP+MPI (OpenMPI 5) in Caliper with LLVM 13.0.1.
My first issue is that with all configurations test 4 fails (v.2.6 and master). Here the tail:
395 │ 4/4 Testing: CI_app_tests
396 │ 4/4 Test: CI_app_tests
397 │ Command: "/usr/bin/python3.10""-B""-m""unittest""discover""-p""test_*.py"
398 │ Directory: /home/dogiermann/Repos/caliper/test/ci_app_tests
399 │ "CI_app_tests" start time: Feb 24 18:18 CET
400 │ Output:
401 │ ----------------------------------------------------------
402 │ .......F..........................................................................
│ ................
403 │ ======================================================================
404 │ FAIL: test_mem_highwatermark_option (test_alloc_hooks.CaliperAllocServiceTest)
405 │ ----------------------------------------------------------------------
406 │ Traceback (most recent call last):
407 │ File "/home/dogiermann/Repos/caliper/test/ci_app_tests/test_alloc_hooks.py", lin
│ e 93, in test_mem_highwatermark_option
408 │ self.assertTrue(float(obj[1]['Allocated MB']) > 0.0)
409 │ AssertionError: False is not true
410 │
411 │ ----------------------------------------------------------------------
412 │ Ran 98 tests in 6.511s
413 │
414 │ FAILED (failures=1)
415 │ <end of output>
416 │ Test time = 6.59 sec
417 │ ----------------------------------------------------------
418 │ Test Failed.
419 │ "CI_app_tests" end time: Feb 24 18:18 CET
420 │ "CI_app_tests"time elapsed: 00:00:06
421 │ ----------------------------------------------------------
422 │
423 │ End testing: Feb 24 18:18 CET
I assume this is unrelated to the next issue. Here I basically followed the OpenMP guide of the docs (https://software.llnl.gov/Caliper/OpenMP.html). If no OpenMP code has been executed, then the resulting error message is very misleading. For example, when executing the serial MFEM caliper example without the "omp" device, then the following message is generated
CALI_USE_OMPT=1 CALI_CONFIG=openmp-report OMP_NUM_THREADS=16 ./ex1 ✔
Options used:
--mesh ../../data/star.mesh
--order 1
--no-static-condensation
--no-partial-assembly
--device cpu
--visualization
--caliper runtime-report
Device configuration: cpu
Memory configuration: host-std
[...]
Average reduction factor = 0.882852
Path Min time/rank Max time/rank Avg time/rank Time %
main 0.029643 0.029643 0.029643 4.200957
Save the results 0.039618 0.039618 0.039618 5.614597
Solve A X=B (FA) 0.000019 0.000019 0.000019 0.002693
PCG 0.379665 0.379665 0.379665 53.805492
Set up the bilinear form 0.167421 0.167421 0.167421 23.726625
Set up the linear form 0.068040 0.068040 0.068040 9.642516
BuildElementToDofTable 0.021202 0.021202 0.021202 3.004712
Construct 0.000017 0.000017 0.000017 0.002409
cali-query: TreeFormatter: Attribute "max#max#n.threads" not found.
cali-query: TreeFormatter: Attribute "avg#inclusive#t.initial" not found.
cali-query: TreeFormatter: Attribute "min#iratio#t.omp.work/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "avg#iratio#t.omp.work/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "avg#iratio#t.omp.sync/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "max#iratio#t.omp.sync/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "avg#sum#t.omp.work" not found.
cali-query: TreeFormatter: Attribute "avg#sum#t.omp.sync" not found.
cali-query: TreeFormatter: Attribute "sum#sum#t.omp.work" not found.
cali-query: TreeFormatter: Attribute "sum#sum#t.omp.sync" not found.
Path Time (total)
main 0.716164
Destroy 0.000018
Save the results 0.039615
Solve A X=B (FA) 0.379684
PCG 0.379662
Set up the bilinear form 0.167417
Set up the linear form 0.068040
BuildElementToDofTable 0.021201
Construct 0.000020
== CALIPER: openmp-report: ompt: OMPT support was not enabled: Set the CALI_USE_OMPT environment variable to enable it (CALI_USE_OMPT=1)
With the parallel example at least part of the "error" is not reported (falsely):
CALI_USE_OMPT=1 CALI_CONFIG=openmp-report OMP_NUM_THREADS=16 mpirun -np 1 ex1p ✔ 10s
Options used:
--mesh ../../data/star.mesh
--order 1
--no-static-condensation
--no-partial-assembly
--device cpu
--visualization
--caliper runtime-report
Device configuration: cpu
Memory configuration: host-std
Number of finite element unknowns: 82561
Num MPI tasks = 1
Num OpenMP threads = 16
[...]
Path Min time/rank Max time/rank Avg time/rank Time %
main 0.149842 0.149842 0.149842 5.760205
Save the results 0.157360 0.157360 0.157360 6.049211
Solve A X=B 0.971888 0.971888 0.971888 37.361182
Set up the bilinear form 0.971716 0.971716 0.971716 37.354570
Set up the linear form 0.264361 0.264361 0.264361 10.162528
BuildElementToDofTable 0.083609 0.083609 0.083609 3.214085
Construct 0.000069 0.000069 0.000069 0.002652
ConstructTrueDofs 0.002486 0.002486 0.002486 0.095566
cali-query: TreeFormatter: Attribute "max#max#n.threads" not found.
cali-query: TreeFormatter: Attribute "min#iratio#t.omp.work/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "avg#iratio#t.omp.work/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "avg#iratio#t.omp.sync/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "max#iratio#t.omp.sync/t.omp.total" not found.
cali-query: TreeFormatter: Attribute "avg#sum#t.omp.work" not found.
cali-query: TreeFormatter: Attribute "avg#sum#t.omp.sync" not found.
cali-query: TreeFormatter: Attribute "sum#sum#t.omp.work" not found.
cali-query: TreeFormatter: Attribute "sum#sum#t.omp.sync" not found.
Path Time (thread) (avg) Time (total)
main 1.408351 2.603814
Save the results 0.157354 0.157354
Solve A X=B 0.971874 0.971874
Set up the bilinear form 0.278919 0.971712
Set up the linear form 0.264359
BuildElementToDofTable 0.083606
Construct 0.002560
ConstructTrueDofs 0.002485
So if I understand you correctly, you find the OMPT support was not enabled error message misleading if you did set the CALI_USE_OMPT environment variable but OpenMP was just never used in the program? I can see the point, though it's a rare case as openmp-report isn't very useful if you don't use OpenMP. We can probably disable the warning if CALI_USE_OMPT is set though.
The failing test likely have nothing to do with the OpenMP issues, but that is interesting nonetheless. What system are you running this on?
Yes, in combination with the "Attribute not found" I think that this message is misleading for two reasons. First, the support is actually enabled. Second, the message is not reported consistently across different configurations. Since I was tracking down some bugs I thought that I broke something related to OpenMP and OMPT.
While for me personally this is not too much of an issue, I think this could be an issue for users with not so much experience in software development. Furhter, I am not sure what the best way to "fix" this would be - or if this has to be adressed at all. I mean, it is a minor issue and maybe this discussion thread and a comment in the docs would also be enough for users running into this or similar issues. I just wanted to share the information I gathered.
Regarding the test failing, my system is a fresh Manjaro (I just set it up yesterday). What additional information do you need?
Dead Caliper developers,
first I have to thank for sharing such a fine profiling library!
Currently I am tracking down some strange performance degradation issues in MFEM with OpenMP+MPI based builds. Since my experience with profiling MFEM through Caliper was good for now and I was able to track down issues in the past pretty quickly with this suite, I decided to use this setup again. However, I run into troubles when using OpenMP+MPI (OpenMPI 5) in Caliper with LLVM 13.0.1.
My first issue is that with all configurations test 4 fails (v.2.6 and master). Here the tail:
I assume this is unrelated to the next issue. Here I basically followed the OpenMP guide of the docs (https://software.llnl.gov/Caliper/OpenMP.html). If no OpenMP code has been executed, then the resulting error message is very misleading. For example, when executing the serial MFEM caliper example without the "omp" device, then the following message is generated
This works correctly with:
With the parallel example at least part of the "error" is not reported (falsely):
My Caliper configuration is as follows:
I think it can be helpful for new users to provide the source for the "caliper-openmp-example", even if it is just for testing purposes.
The text was updated successfully, but these errors were encountered: