support for near-null vectors (and eigenvectors)
#1398
Conversation
… format, plus exposed it on the command line
This is a great addition. A couple of things:

- `io_test` needs to be extended to test the `PARTFILE` saving. Of course this will only be non-trivial when running on multiple processes, but that's fine.
- Should the `QudaBoolean` addition in the interface instead just be `bool`? We already implicitly require C99 support, so I see no reason not to just use `bool`. While in the long term we'll want to remove `QudaBoolean` for legacy interface options, perhaps now we draw a line in the sand and just use `bool` for new additions?
re: As for just using |
Fair enough regarding |
This PR exposes the ability to save near-null vectors (and eigenvectors) in QIO's `PARTFILE` format, which is one file per MPI rank. The primary purpose of this is to speed up the saving (and loading) of near-null vectors during MG when tuning the algorithm, but it can also be used (very effectively) in production runs so long as you can assume the process decomposition will not change between runs.

A description of a `PARTFILE` workflow, where files are stored to per-node local scratch disks, copied to the network drive after the run, and then the process run in reverse on later runs, has already been documented on the QUDA wiki.

This is threaded through the test executables via the flags `--mg-save-partfile` and `--eig-save-partfile`, as well as through the MILC MG interface.

Of note: there is no need for an analogous "loading" flag because QIO will automatically look for a singlefile, then a partfile, version of a file on load. There is also no functional reason why this couldn't be added for gauge fields as well; there is just far less of a use case (and much more risk of confusion).
This has been verified to give a speedup for 144^3x288 HISQ MG workflows on Selene where saving 64 fine-level near-null vectors goes from taking ~144 seconds to ~6 seconds. While I don't have the allocation to perform fresh timings on other machines, historically I have seen the analogous save take up to an hour on Summit; it's expected this would be much faster with the on-node SSDs.