Feature/gaugefield unity #1384
…method with a better replacement named data()
…fication. Introduced new memory allocation wrapper quda_ptr, which is deployed for gauge field allocations. Still a WIP
…ment operators for GaugeField
… were accidentally not being run
…d copy assignment) to clean up interface_quda.cpp. Added new profile stack to allow for autoprofiling while also dramatically reducing LOC in the interface. Work in progress
…ious QUDA interfaces. Add ref counting support to the profiling, to allow for multiple starts without throwing an error: if a timer has already been started we simply increment the ref counter and return. Profiling now performs a device sync if the type is H2D, D2H or COMPUTE: this negates the need to use explicit synchronization and ensures accurate profiling
…sfers. Further interface cleanup
…ocations. Some cleanup
…nal template cast type
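The ref-counting behaviour described in the profiling commit above can be sketched roughly as follows. This is a minimal illustration, not QUDA's profiling code: the `RefCountedTimer` type and its members are hypothetical names, and the device synchronization that the commit mentions for H2D, D2H and COMPUTE timers is left out so the sketch stays host-only.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical stand-in for a profiling timer; not QUDA's actual class.
struct RefCountedTimer {
  using clock = std::chrono::steady_clock;

  clock::time_point start_time {};
  double elapsed = 0.0; // accumulated seconds over all completed intervals
  int ref_count = 0;    // number of start() calls not yet matched by stop()
  int intervals = 0;    // number of completed outermost intervals

  void start()
  {
    // Already running: increment the ref counter and return, no error.
    if (ref_count++ > 0) return;
    start_time = clock::now();
  }

  void stop()
  {
    assert(ref_count > 0 && "stop() called without a matching start()");
    // An outer start() is still active, so keep timing.
    if (--ref_count > 0) return;
    elapsed += std::chrono::duration<double>(clock::now() - start_time).count();
    ++intervals;
  }

  bool running() const { return ref_count > 0; }
};
```

With this shape, nested start/stop pairs on the same timer record a single interval that spans the outermost pair.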
Have verified (again) that CPS works well with this branch.
Tentative approve, I noticed a few cosmetic things, and this needs a fresh merge of |
@SaltyChiang I have merged your changes into this branch, which will shortly be merged into develop. One line I have not included is this one:
as this line doesn't make sense to me. If it's a reference to a pre-existing field, why should the value be a Thanks |
Hi @maddyscientist, there are some Line 3873 in 6e90b49
Line 3903 in 6e90b49
ulink is nullptr here, the unitarized fat7 link will not be calculated, which is right; but cpuUnitarizedLink defined in Line 3847 in 6e90b49
((void **)nullptr)[d] . A null QUDA_QDP_GAUGE_ORDER field should be something like void *ulink[4] = {nullptr, nullptr, nullptr, nullptr} but now the if (ulink) will return true which we do not expect.
So I made all four reference fields |
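To make the pointer issue above concrete: indexing a null `void **` is undefined behaviour, while an array of four null pointers is itself a non-null address, so a plain `if (ulink)` test passes. A check that handles both cases might look like the sketch below. The helper name `has_qdp_data` is hypothetical and is not part of QUDA.

```cpp
#include <cassert>

// Hypothetical helper (not a QUDA function): returns true only if a
// QDP-ordered field, i.e. one pointer per dimension, has data behind
// every dimension.
inline bool has_qdp_data(void *const *field, int n_dim = 4)
{
  if (!field) return false; // no pointer array at all
  for (int d = 0; d < n_dim; d++) {
    if (!field[d]) return false; // array exists, but this entry is null
  }
  return true;
}
```

A caller holding `void *ulink[4] = {nullptr, nullptr, nullptr, nullptr}` would get `false` from this check even though `ulink` itself is non-null.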
@SaltyChiang thanks for the explanation. I have fixed my branch accordingly (4c308f6), so it should work for your use case. If you could test this branch, that would be appreciated, thanks.
…unning, and if so push it to the stack, and restore after the newly started timer is stopped. Fixes timing issues as noted by Jiqun
… which is incremented whenever tuneLaunch is called; for solver gflops and timing, we now compute the time and gflop between pushing the present interface profile, this now ensures we include all operations and includes upload/download time
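One reading of the push-and-restore behaviour in the commits above is a stack of profiles in which starting a new profile pauses the one currently running, and stopping it resumes the previous one. The sketch below illustrates that reading only. `Profile` and `ProfileStack` are hypothetical types, and the real pushProfile, popProfile and getProfile functions may differ in signature and behaviour.

```cpp
#include <cassert>
#include <vector>

// Hypothetical profile record, reduced to the state needed for the sketch.
struct Profile {
  bool running = false;
  int starts = 0; // how many times this profile has been (re)started
  void start() { running = true; ++starts; }
  void stop() { running = false; }
};

// Hypothetical stack of active profiles (non-owning pointers).
class ProfileStack
{
  std::vector<Profile *> stack;

public:
  // Pause whatever is running on top, then start and push the new profile.
  void push(Profile &p)
  {
    if (!stack.empty() && stack.back()->running) stack.back()->stop();
    p.start();
    stack.push_back(&p);
  }

  // Stop the top profile, remove it, and restore the one beneath it, if any.
  void pop()
  {
    assert(!stack.empty());
    stack.back()->stop();
    stack.pop_back();
    if (!stack.empty()) stack.back()->start();
  }

  // Reference to the profile currently on top of the stack.
  Profile &top()
  {
    assert(!stack.empty());
    return *stack.back();
  }

  bool empty() const { return stack.empty(); }
};
```

Because the outer profile is paused while the inner one runs, time is attributed to exactly one profile at any moment in this sketch.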
Looks great, thanks @maddyscientist !
Hi @maddyscientist, very sorry for the late feedback. I think it should be Lines 3748 to 3749 in 0c016af
Edit: I believe Lines 765 to 774 in 0c016af
@SaltyChiang thanks for pointing out my bug. I've pushed a fix to my next PR that will be merged (#1393). This is a fairly simple PR, so will likely be merged soon.
@maddyscientist thanks for the fix. You may have missed another issue that occurs when saving the smeared gauge through saveGaugeQuda(). I updated the comment above and I repeat it here.
Thanks again @SaltyChiang. Fix is here 44c4100 (in #1393).
…uge field regardless of whether it is a pointer or an array of pointers. Fixes callMultiSrcQuda (broke with #1384).
This PR's primary purpose is to unify the cpuGaugeField and cudaGaugeField classes, along with an assortment of other cleanups.

- … cpuGaugeField and cudaGaugeField classes, we now have only GaugeField
- … ColorSpinorField
- the communication routines for host and device are still separate within the GaugeField class; unification of these is deferred until later
- … GaugeField to facilitate significant cleanup
- … data() method. This is to align with STL design, though here we also have an optional template parameter that can be used to perform a cast to the desired type. (Also applied to ColorSpinorField and CloverField.)
- … quda_ptr, which provides heterogeneous memory allocation
- … GaugeField, ColorSpinorField and CloverField
- … qudaMemcpy and qudaMemset functions that accept a quda_ptr
- … pushProfile and popProfile functions, together with getProfile for getting a reference to the current top-of-stack profile.

Outstanding items

- … quda_ptr for ROCm
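
As an illustration of the data() accessor with an optional cast type described above, here is a minimal sketch. The `Field` class is a hypothetical stand-in and not the QUDA GaugeField; it assumes the accessor defaults to returning `void *` and casts to the requested pointer type when a template argument is given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical field class; only the accessor pattern matters here.
class Field
{
  void *ptr = nullptr;   // raw storage owned by this object
  std::size_t n_bytes = 0;

public:
  explicit Field(std::size_t bytes) : ptr(std::malloc(bytes)), n_bytes(bytes) {}
  ~Field() { std::free(ptr); }

  // Non-copyable to keep ownership of the buffer simple in this sketch.
  Field(const Field &) = delete;
  Field &operator=(const Field &) = delete;

  // data() returns the raw pointer, in the spirit of STL containers;
  // data<float *>() returns the same address cast to the requested type.
  template <typename T = void *> T data() const { return static_cast<T>(ptr); }

  std::size_t bytes() const { return n_bytes; }
};
```

Calling `field.data()` yields the untyped pointer, while `field.data<float *>()` avoids an explicit cast at the call site.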