Stash datafile numpy arrays and concatenate once #57

Open
wants to merge 1 commit into
base: master
from

Conversation

zackcornelius

Avoid appending to xt_all and yt_all during datagen by stashing the per-frame xt and yt arrays in a Python list.

Concatenate all the xt and yt arrays once, after all datagen frames have been processed, so that the memory copy is triggered only once.
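For readers who want to see the pattern in isolation, here is a minimal sketch. It is not the actual patch: the generator `fake_frames`, the array shapes, and the use of `np.append` for the old behaviour are all assumptions made for illustration. Only the names `xt_all` and `yt_all` come from the description above.

```python
import numpy as np

def fake_frames(n_frames=5, rows=4, features=3):
    """Hypothetical stand-in for the datagen loop: yields one (xt, yt) pair per frame."""
    rng = np.random.default_rng(0)
    for _ in range(n_frames):
        yield rng.random((rows, features)), rng.random((rows, 1))

def collect_by_append(frames):
    """Assumed old pattern: grow xt_all / yt_all every frame.

    np.append allocates a new array and copies all existing data on each call.
    """
    xt_all = yt_all = None
    for xt, yt in frames:
        xt_all = xt if xt_all is None else np.append(xt_all, xt, axis=0)
        yt_all = yt if yt_all is None else np.append(yt_all, yt, axis=0)
    return xt_all, yt_all

def collect_by_stash(frames):
    """Pattern described in this PR: stash per-frame arrays, concatenate once."""
    xt_list, yt_list = [], []
    for xt, yt in frames:
        xt_list.append(xt)  # list.append stores a reference, no array copy
        yt_list.append(yt)
    # A single allocation and a single copy of the data for each output array.
    xt_all = np.concatenate(xt_list, axis=0)
    yt_all = np.concatenate(yt_list, axis=0)
    return xt_all, yt_all
```

Both functions should return identical arrays; the difference is only in how many times the accumulated data is copied along the way.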

Before this patch, p2b1_baseline_keras2.py on Haswell (Cooley at Argonne - E5-2620v3 x2, 384 GB RAM, K80 GPU) runs in 4590 seconds.

After this patch, it runs in 3555 seconds, a ~23% reduction in runtime (roughly a 1.29x speedup).

In situations with limited memory bandwidth (such as when using Optane DC Memory, or external memory via the RAN project at Argonne), this would have a significantly higher impact.
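To see why memory bandwidth matters here, a rough model may help. It assumes every frame has the same size and that each append rewrites the whole accumulated array; the function names and the numbers in the usage note are hypothetical, not measurements from this PR.

```python
def elements_copied_by_append(n_frames, frame_size):
    """Rough model: each append writes a new array holding everything so far."""
    total = 0
    accumulated = 0
    for _ in range(n_frames):
        accumulated += frame_size
        total += accumulated  # the whole accumulated array is written again
    return total

def elements_copied_by_single_concat(n_frames, frame_size):
    """Stash-then-concatenate: every element is copied exactly once."""
    return n_frames * frame_size
```

Under this model, 100 frames of 1000 elements each cost 5,050,000 element copies with repeated appends versus 100,000 with a single concatenate. The append cost grows quadratically with the number of frames, so the gap widens on longer runs.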


@mseryn mseryn left a comment

Changes look good. Performance speedup is useful.


3 participants