
Compile problems on Cori #603

Closed
petschge opened this issue Nov 13, 2019 · 3 comments
Labels: install, machine/system (machine & HPC system specific issues), question

@petschge

Hi,

I am trying to compile openPMD on Cori to test performance of HDF5 and Adios2. (Based on boasting claims by Axel that I should throw away my own HDF5 code, because openPMD is easier and Adios2 is faster anyway).

Since Spack is not available on this machine, I decided to build from source and checked out openPMD using `git clone https://github.com/openPMD/openPMD-api.git`.

The set of modules I have loaded is:

```
$ module list
Currently Loaded Modulefiles:
  1) modules/3.2.11.1
  2) nsg/1.2.0
  3) intel/19.0.3.199
  4) craype-network-aries
  5) craype/2.5.18
  6) cray-libsci/19.02.1
  7) udreg/2.3.2-7.0.0.1_4.31__g8175d3d.ari
  8) ugni/6.0.14.0-7.0.0.1_7.35__ge78e5b0.ari
  9) pmi/5.0.14
 10) dmapp/7.1.1-7.0.0.1_5.29__g25e5077.ari
 11) gni-headers/5.0.12.0-7.0.0.1_7.46__g3b1768f.ari
 12) xpmem/2.2.17-7.0.0.1_3.28__g7acee3a.ari
 13) job/2.2.4-7.0.0.1_3.36__g36b56f4.ari
 14) dvs/2.11_2.2.140-7.0.0.1_13.5__gdf9ebba2
 15) alps/6.6.50-7.0.0.1_3.44__g962f7108.ari
 16) rca/2.2.20-7.0.0.1_4.42__g8e3fb5b.ari
 17) atp/2.1.3
 18) PrgEnv-intel/6.0.5
 19) craype-haswell
 20) cray-mpich/7.7.6
 21) craype-hugepages2M
 22) altd/2.0
 23) darshan/3.1.7
 24) cray-hdf5-parallel/1.10.2.0
 25) cmake/3.14.4
 26) adios2/2.4.0
```

Note the presence of the adios2 and the hdf5-parallel modules.

I then ran `cmake -DopenPMD_USE_ADIOS2=ON ..` from within `openPMD-api/build`,

which produced at the end (let me know if you need the intermediate part that I snipped here):

```
openPMD build configuration:
  library Version: 0.10.0
  openPMD Standard: 1.1.0
  C++ Compiler: Intel 19.0.0.20190206 CrayPrgEnv
    /opt/cray/pe/craype/2.5.18/bin/CC

  Installation prefix: /usr/local
        bin: bin
        lib: lib64
    include: include
      cmake: lib64/cmake/openPMD
     python: lib64/python3.6/site-packages

  Additionally, install following third party libraries:
    MPark.Variant: ON

  Build Type: Release
  Library: static
  Testing: ON
  Invasive Tests: OFF
  Internal VERIFY: ON
  Build Options:
    MPI: ON
    HDF5: ON
    ADIOS1: OFF
    ADIOS2: ON
    PYTHON: ON
```

`cmake --build .` made slow but steady progress until it hit

```
[ 37%] Building CXX object CMakeFiles/openPMD.dir/src/IO/InvalidatableFile.cpp.o
[ 38%] Linking CXX static library lib/libopenPMD.a
[ 38%] Built target openPMD
Scanning dependencies of target 7_extended_write_serial
[ 39%] Building CXX object CMakeFiles/7_extended_write_serial.dir/examples/7_extended_write_serial.cpp.o
[ 41%] Linking CXX executable bin/7_extended_write_serial
/usr/bin/ld: attempted static link of dynamic object `/global/common/sw/cray/cnl7/haswell/adios2/2.4.0/intel/19.0.3.199/26fprnf/lib64/libadios2.so.2.4.0'
gmake[2]: *** [CMakeFiles/7_extended_write_serial.dir/build.make:86: bin/7_extended_write_serial] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:73: CMakeFiles/7_extended_write_serial.dir/all] Error 2
gmake: *** [Makefile:141: all] Error 2
```

Why the hell is openPMD building with `Library: static` despite the documentation saying "By default, this will build as a shared library", and why is it trying to statically link a dynamic object?
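For reference, two quick checks narrow this down (a sketch: `BUILD_SHARED_LIBS` is the standard CMake cache variable, `CRAYPE_LINK_TYPE` is the Cray PE variable, and the "static" fallback shown is an assumption about this system's default):

```shell
# Sketch: check what the build actually decided. BUILD_SHARED_LIBS is the
# standard CMake switch for shared vs. static; run this in the build dir:
#   grep BUILD_SHARED_LIBS CMakeCache.txt
# The Cray compiler wrappers additionally honor CRAYPE_LINK_TYPE; when it
# is unset they fall back to static linking:
echo "link type: ${CRAYPE_LINK_TYPE:-static}"
```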


ax3l commented Nov 13, 2019

Hi,

thanks for opening an issue.
Cori is an "older" Cray cluster that has not yet defaulted to dynamic linking. I documented detailed instructions for this in WarpX, but long story short: you have to set `export CRAYPE_LINK_TYPE=dynamic` so that the Spack modules and the dynamic libs for the ADIOS1 MPI-wrapping build:

```shell
module swap craype-haswell craype-mic-knl
module swap PrgEnv-intel PrgEnv-gnu
module load cmake/3.14.4
module load cray-hdf5-parallel
module load adios/1.13.1
export CRAYPE_LINK_TYPE=dynamic

git clone https://github.com/openPMD/openPMD-api.git
mkdir openPMD-api-build
cd openPMD-api-build
cmake ../openPMD-api -DopenPMD_USE_PYTHON=OFF -DCMAKE_INSTALL_PREFIX=../openPMD-install/ -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_RPATH='$ORIGIN'
cmake --build . --target install
```
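After the install, it is worth verifying that the library really came out shared. A sketch: the `ldd` call in the comment assumes the install prefix from the commands above, and `lib_kind` is a hypothetical helper just for illustration:

```shell
# On the real system you would run, e.g.:
#   ldd ../openPMD-install/lib64/libopenPMD.so | grep -E 'adios|hdf5'
# Illustrative helper: classify a library file by its suffix.
lib_kind() {
  case "$1" in
    *.so|*.so.*) echo shared ;;
    *.a)         echo static ;;
    *)           echo unknown ;;
  esac
}
lib_kind libopenPMD.so   # shared
lib_kind libopenPMD.a    # static
```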

That said, I have verified performance of HDF5 vs. ADIOS1 in the past via PIConGPU on Titan and HZDR-local systems. Note: we are just getting started with openPMD-api performance tuning. openPMD-api is a re-implementation of our manual HDF5 & ADIOS1 routines and generalizes these PIConGPU routines for the broader community (and us).

So if you find any performance issues - please do not hesitate to report them here! Open for contributions is, e.g., #578; please use the latest dev branch and check the latest manual :)

Be aware: ADIOS2 support is brand new and unreleased, so for now maybe also set `export OPENPMD_BP_BACKEND="ADIOS1"`. We are just getting started running our own benchmarks on Cori, so you are overtaking our "adoption tests" of the system right now ;)
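For completeness, that backend override is a plain environment variable read at runtime; a minimal sketch (variable name and value as suggested above):

```shell
# Select the ADIOS1 backend for .bp files at runtime, as suggested above;
# set this in the job script / shell before starting the application.
export OPENPMD_BP_BACKEND="ADIOS1"
echo "$OPENPMD_BP_BACKEND"
```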

> Based on boasting claims by Axel that I should throw away my own HDF5 code, because openPMD is easier and Adios2 is faster anyway

Haha, that might be my subtext in different words, but generally okay :D Where did you last see me claiming that, in very mild and politically correct words? ICNSP/APS? ;)

Looking forward to your results!

cc @guj

P.S.: We will put in a request for an openPMD-api 0.10.0 system module on Cori, which should be available within a month.

@ax3l ax3l self-assigned this Nov 13, 2019
@ax3l ax3l added the machine/system machine & HPC system specific issues label Nov 13, 2019
@petschge (Author)

Thanks for the quick and helpful answer.

And please excuse the sass, I was grumpy after spending hours and getting nowhere.

```shell
export CRAYPE_LINK_TYPE=dynamic
cmake -DopenPMD_USE_ADIOS2=ON -DopenPMD_USE_PYTHON=OFF -DCMAKE_INSTALL_PREFIX=../install/ -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=ON -DCMAKE_INSTALL_RPATH='$ORIGIN' ..
```

did indeed build. On to the next steps...


ax3l commented Nov 14, 2019

Sorry about that; this should be documented more prominently in the Cori system docs. You are not the first one bitten by it:
LLNL/GOTCHA#41
https://confluence.slac.stanford.edu/display/~heather/Building+CCL+at+NERSC
https://www.nersc.gov/assets/Uploads/04-PE-Compilations-NewUserTraining-20190621.pdf

The modules you are using are built with Spack, which already sets `CRAYPE_LINK_TYPE=dynamic` for the dependencies. The NERSC system admins just do not want to break workflows for users who still rely on static-only Cray defaults.

And dynamic linking is "super new" for Cray systems - I mean, it was introduced around 2012/13 on Titan, so why adopt such hot software-level features so quickly, right? ;)

That said: openPMD-api can be built fully static (including all dependencies) as long as ADIOS1 is not needed. ADIOS1 itself is split into two static libs (MPI/non-MPI) but does some sub-ideal MPI-mocking, which I work around for usability with a shared-library wrapper (ornladios/ADIOS#183). That problem is solved in ADIOS2.
Anyway, that won't help you on Cori yet unless you build your own static HDF5 and static ADIOS2 libraries to link against.

Just so you know the background story - I always like background stories when I'm the one who felt the pain...

But hold on for the plot twist: in Dec 2019, dynamic linking will be enabled by default on Cori:
https://docs.nersc.gov/systems/cori/timeline/default_PE_history/2019Nov/
I guess you were just ahead of your time :)
