Stash datafile numpy arrays and concatenate once #57

zackcornelius · 2020-01-28T22:00:18Z

Avoid appending to xt_all and yt_all during datagen by stashing each frame's xt and yt arrays in a Python list.

Concatenate all the xt and yt arrays after every datagen frame has been processed, so that the memory copy happens only once.
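The change described above can be sketched roughly as follows. This is a minimal illustration, not the actual patch: `gather_by_append`, `gather_by_stash`, and `fake_frames` are hypothetical names, and the frame shapes are made up. Only `xt`, `yt`, `xt_all`, and `yt_all` come from the description.

```python
import numpy as np


def gather_by_append(frames):
    """Old pattern (assumed): grow xt_all / yt_all on every frame.

    np.append allocates a new array and copies all previously
    accumulated data each time it is called.
    """
    xt_all, yt_all = None, None
    for xt, yt in frames:
        xt_all = xt if xt_all is None else np.append(xt_all, xt, axis=0)
        yt_all = yt if yt_all is None else np.append(yt_all, yt, axis=0)
    return xt_all, yt_all


def gather_by_stash(frames):
    """New pattern: stash per-frame arrays in Python lists, concatenate once."""
    xt_stash, yt_stash = [], []
    for xt, yt in frames:
        xt_stash.append(xt)  # list append stores a reference, no array copy
        yt_stash.append(yt)
    # One allocation and one copy per output array.
    # Note: np.concatenate raises ValueError if the list is empty.
    return np.concatenate(xt_stash, axis=0), np.concatenate(yt_stash, axis=0)


def fake_frames(n_frames=4, rows=3, features=5):
    """Hypothetical stand-in for the datagen frames."""
    rng = np.random.default_rng(0)
    return [(rng.random((rows, features)), rng.random((rows, 1)))
            for _ in range(n_frames)]


frames = fake_frames()
xt_old, yt_old = gather_by_append(frames)
xt_new, yt_new = gather_by_stash(frames)
```

Both functions produce the same arrays; the difference is that the append version re-copies the accumulated data on every frame, while the stash version copies each frame's data once.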

Before this patch, p2b1_baseline_keras2.py on Haswell (Cooley at Argonne - E5-2620v3 x2, 384 GB RAM, K80 GPU) runs in 4590 seconds.

After this patch, it runs in 3555 seconds, a ~23% reduction in runtime (roughly a 1.29x speedup).

In situations with limited memory bandwidth (such as when using Optane DC Memory, or external memory via the RAN project at Argonne), this would have a significantly higher impact.


mseryn

Changes look good. Performance speedup is useful.

Stash datafile numpy arrays and concatenate once

cdd9c47

Avoid appending to xt_all and yt_all during datagen by stashing xt and yt arrays in a python list. Concatenate all the xt and yt arrays after all datagen frames have been processed, to trigger memcopy only once.

mseryn approved these changes Apr 13, 2020
