Feature/gaugefield unity #1384

Merged
merged 67 commits into from
Oct 18, 2023
67 commits
727872d
Initial work towards unification of gauge fields. Replaced Gauge_p()…
maddyscientist Jun 21, 2022
7a7b65c
Merge branch 'develop' of github.com:lattice/quda into feature/gaugef…
maddyscientist Jul 16, 2022
53b7517
Improve error reporting when vol_string exceeds max size
maddyscientist Jul 21, 2022
c3fb2eb
Significant rework of memory allocation to facilitate gauge field uni…
maddyscientist May 8, 2023
21a9482
Merge branch 'develop' into feature/gaugefield_unity
maddyscientist May 10, 2023
8e0207e
Move gauge field exchange functions to GaugeField from cpu/cuda children
maddyscientist May 11, 2023
af7fb2c
Further steps towards gauge field unification (copy/load/save routine…
maddyscientist May 11, 2023
0671db1
Removed legacy load/save CPUField routines, replaced with GaugeField:…
maddyscientist May 11, 2023
f5e8eac
Removal of cpuGaugeField and cudaGaugeField, we have now only GaugeField
maddyscientist May 13, 2023
19aa064
Add null, move and copy constructors, as well as copy and move assign…
maddyscientist May 16, 2023
bc3dba0
Fix some issues with staggered quark smearing
maddyscientist May 16, 2023
13eb7e1
Fix HISQ force since unification, and re-enable hisq force ctests which…
maddyscientist May 18, 2023
e538fa0
Commenced use of new GaugeField features (default constructor, move an…
maddyscientist May 18, 2023
3db98e1
Continued to add auto profiling support and GaugeField cleanup to var…
maddyscientist May 19, 2023
a75fad1
More interface code related cleanup
maddyscientist May 22, 2023
52a1d1c
ColorSpinorField and CloverField now autoprofile any H2D and D2H tran…
maddyscientist May 22, 2023
442a460
Add qudaMemsetAsync and qudaMemcpy overloads for quda_ptr
maddyscientist May 26, 2023
44586cd
Use quda_ptr for both color_spinor_field.cpp and clover_field.cpp all…
maddyscientist May 26, 2023
ece19db
Fix clang warnings
maddyscientist May 26, 2023
f568585
Clean up and fix some bugs that crept in
maddyscientist May 30, 2023
b15f94d
Update MRE solver to use getProfile
maddyscientist May 31, 2023
2f4c41d
Include some missing headers that broke jitify
maddyscientist May 31, 2023
0178ab5
Move contents of TimeProfile to timer.cpp to avoid breaking jitify.
maddyscientist May 31, 2023
bf14e68
Fixed for covdev_test
maddyscientist May 31, 2023
7bf774c
Update jitify to latest with some custom additions yet to be back ported
maddyscientist May 31, 2023
c42794f
Rename ColorSpinorField/CloverField::V methods to data, with an optio…
maddyscientist Jun 1, 2023
838ff4f
Fix clang warning
maddyscientist Jun 1, 2023
9aa20ce
Remove std::move on temporary quda_ptr objects since this prevents th…
maddyscientist Jun 1, 2023
08b99a5
Move quda_ptr to its own file, and make it generic
maddyscientist Jun 2, 2023
2516878
Add missing utility header
maddyscientist Jun 5, 2023
27badee
Fix issue with Wilson MG
maddyscientist Jun 23, 2023
71a5335
Merge branch 'develop' of github.com:lattice/quda into feature/gaugef…
maddyscientist Aug 11, 2023
dd66595
Removed unneeded static_cast
maddyscientist Aug 11, 2023
e818659
Fix HIP builds
maddyscientist Aug 12, 2023
50987b1
Minor review comment
maddyscientist Aug 18, 2023
a92296b
Merge branch 'develop' of github.com:lattice/quda into feature/gaugef…
maddyscientist Aug 18, 2023
f8b3244
Add default assignment operator for TimeProfile class
maddyscientist Aug 19, 2023
7a75dc7
Merge branch 'develop' of github.com:lattice/quda into feature/gaugef…
maddyscientist Aug 29, 2023
06d2dcb
Further cleanup and minor fixes
maddyscientist Aug 29, 2023
a61fbba
Fix issues with staggered_invert_test related to gauge-field unification
maddyscientist Aug 30, 2023
3963f63
Pushing a profile onto the stack is now handled using an auxiliary co…
maddyscientist Aug 30, 2023
426b59a
Respond to review comments
maddyscientist Aug 31, 2023
63e474d
Fix some overflow issues with large volumes
maddyscientist Sep 11, 2023
56a719d
Fix some overflow issues with tests
maddyscientist Sep 12, 2023
f14d7ff
Minor cleanup of heatbath_test and fix an issue found in testing with…
maddyscientist Sep 12, 2023
b19fe54
Fix typo
maddyscientist Sep 12, 2023
bf29f03
Fix typo
maddyscientist Sep 12, 2023
dc5ec21
Updates for quda_ptr: add custom exchange function since std::exchang…
maddyscientist Sep 21, 2023
8d6871e
Fix issues with move assignment with GaugeField and ColorSpinorField …
maddyscientist Sep 21, 2023
97ee4ee
Fix some residency issues found while testing MILC, use GaugeField::e…
maddyscientist Sep 21, 2023
69f7303
Fix #1406
maddyscientist Sep 21, 2023
8aac21a
Fix 32-bit overflow issue when sizing compressed gauge fields (Thanks…
maddyscientist Sep 21, 2023
bcef438
Merge branch 'develop' of github.com:lattice/quda into feature/gaugef…
maddyscientist Sep 27, 2023
415a443
LatticeFieldParam should set its location from QudaGaugeParam::location
maddyscientist Oct 3, 2023
b211699
Fix for QUDA_CTEST_LAUNCH
maddyscientist Oct 3, 2023
dfef80f
Fix for modern Fortran compilers
maddyscientist Oct 4, 2023
c5410be
When creating momentum field, always use periodic boundary conditions
maddyscientist Oct 4, 2023
23a6251
Merge branch 'develop' of github.com:lattice/quda into feature/gaugef…
maddyscientist Oct 6, 2023
4c308f6
Don't dereference nullptr when creating reference QDP fields
maddyscientist Oct 7, 2023
a16e51c
Prevent concurrent timers from running: check if a timer is already r…
maddyscientist Oct 11, 2023
85292b2
Cleanup of solver timing and flops handling: add global flops counter…
maddyscientist Oct 13, 2023
fe29798
Report MG setup time and performance in invert_test and staggered_inv…
maddyscientist Oct 13, 2023
d3649dd
Remove legacy blas flop and byte counting
maddyscientist Oct 16, 2023
0a413c7
Remove legacy Dirac flop counter and switch to using QUDA's global fl…
maddyscientist Oct 16, 2023
14b36bf
Don't count policy flops / bytes in the global counters to avoid doub…
maddyscientist Oct 16, 2023
4e8fb5d
Apply clang format
maddyscientist Oct 16, 2023
5e55e1e
More clang-format
maddyscientist Oct 16, 2023
8 changes: 3 additions & 5 deletions include/blas_helper.cuh
@@ -111,9 +111,7 @@ namespace quda
{}

data_t(const ColorSpinorField &x) :
-spinor(static_cast<store_t *>(const_cast<ColorSpinorField &>(x).V())),
-stride(x.VolumeCB()),
-cb_offset(x.Bytes() / (2 * sizeof(store_t) * N))
+spinor(x.data<store_t *>()), stride(x.VolumeCB()), cb_offset(x.Bytes() / (2 * sizeof(store_t) * N))
{}
};

@@ -141,8 +139,8 @@ namespace quda
{}

data_t(const ColorSpinorField &x) :
-spinor(static_cast<store_t *>(const_cast<ColorSpinorField &>(x).V())),
-norm(static_cast<norm_t *>(const_cast<ColorSpinorField &>(x).Norm())),
+spinor(x.data<store_t *>()),
+norm(static_cast<norm_t *>(x.Norm())),
stride(x.VolumeCB()),
cb_offset(x.Bytes() / (2 * sizeof(store_t) * N)),
cb_norm_offset(x.Bytes() / (2 * sizeof(norm_t)))
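
The `cb_offset` expression in the `data_t` constructors above divides the field's byte count by `2 * sizeof(store_t) * N`. A minimal sketch of that arithmetic follows, assuming the allocation holds two equally sized checkerboard halves back to back with no padding or norm zone; `field_bytes`, `volume_cb` and `reals_per_site` are hypothetical names introduced only for illustration and are not part of QUDA.

```cpp
#include <cassert>
#include <cstddef>

// Offset between the two checkerboard halves, counted in N-element vectors
// of store_t, using the same expression as the data_t constructors:
// bytes / (2 * sizeof(store_t) * N).
template <typename store_t, int N> constexpr std::size_t cb_offset(std::size_t bytes)
{
  return bytes / (2 * sizeof(store_t) * N);
}

// Hypothetical helper (assumption): byte size of a two-parity field that
// stores reals_per_site values of type store_t at each checkerboard site.
template <typename store_t> constexpr std::size_t field_bytes(std::size_t volume_cb, std::size_t reals_per_site)
{
  return 2 * volume_cb * reals_per_site * sizeof(store_t);
}
```

Under these assumptions the result does not depend on the storage type: a field with 24 reals per site and `N = 4` gives six vectors per site in each half, whether `store_t` is `float` or `short`.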
5 changes: 1 addition & 4 deletions include/blas_quda.h
@@ -23,17 +23,14 @@ namespace quda {

void setParam(int kernel, int prec, int threads, int blocks);

-extern unsigned long long flops;
-extern unsigned long long bytes;
-
inline void zero(cvector_ref<ColorSpinorField> &x)
{
for (auto i = 0u; i < x.size(); i++) x[i].zero();
}

inline void copy(ColorSpinorField &dst, const ColorSpinorField &src)
{
-if (dst.V() == src.V()) {
+if (dst.data() == src.data()) {
// check the fields are equivalent else error
if (ColorSpinorField::are_compatible(dst, src))
return;
21 changes: 12 additions & 9 deletions include/clover_field.h
@@ -178,9 +178,10 @@ namespace quda {
int nColor = 0;
int nSpin = 0;

-void *clover = nullptr;
-void *cloverInv = nullptr;
+quda_ptr clover = {};
+quda_ptr cloverInv = {};

+bool inverse = false;
double diagonal = 0.0;
array<double, 2> max = {};

@@ -213,12 +214,18 @@ namespace quda {

public:
CloverField(const CloverFieldParam &param);
-virtual ~CloverField();

static CloverField *Create(const CloverFieldParam &param);

-void* V(bool inverse=false) { return inverse ? cloverInv : clover; }
-const void* V(bool inverse=false) const { return inverse ? cloverInv : clover; }
+template <typename T = void *> auto data(bool inverse = false) const
+{
+return inverse ? reinterpret_cast<T>(cloverInv.data()) : reinterpret_cast<T>(clover.data());
+}
+
+/**
+@return whether the inverse has explicitly been allocated
+*/
+bool Inverse() const { return inverse; }

/**
@return diagonal scaling factor applied to the identity
@@ -406,10 +413,6 @@ namespace quda {
*/
void copy_from_buffer(void *buffer);

-friend class DiracClover;
-friend class DiracCloverPC;
-friend class DiracTwistedClover;
-friend class DiracTwistedCloverPC;
};

/**
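
The `data<T>()` accessor added to `CloverField` above replaces the paired `V()` overloads and moves the pointer cast from the call site into the accessor. The sketch below shows that shape in isolation. `ptr_wrapper` and `field_sketch` are hypothetical stand-ins written for this example: the real `quda_ptr` is not shown in this diff, so only a `data()` member returning `void *` is assumed.

```cpp
#include <cassert>

// Hypothetical stand-in for quda_ptr (assumption): only a data() accessor
// that returns the raw pointer is modelled here.
class ptr_wrapper
{
  void *ptr = nullptr;

public:
  ptr_wrapper() = default;
  explicit ptr_wrapper(void *p) : ptr(p) { }
  void *data() const { return ptr; }
};

// Hypothetical field class with the same accessor shape as CloverField::data.
class field_sketch
{
  ptr_wrapper clover = {};
  ptr_wrapper cloverInv = {};

public:
  field_sketch(void *c, void *c_inv) : clover(c), cloverInv(c_inv) { }

  // T defaults to void *, so data() with no template argument yields an
  // untyped pointer, while data<float *>() yields a typed one.
  template <typename T = void *> auto data(bool inverse = false) const
  {
    return inverse ? reinterpret_cast<T>(cloverInv.data()) : reinterpret_cast<T>(clover.data());
  }
};
```

A caller that previously wrote a cast around `V(inverse)` would, in this sketch, write `f.data<float *>(inverse)` instead, which is the pattern the accessor changes in `clover_field_order.h` follow.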
14 changes: 7 additions & 7 deletions include/clover_field_order.h
@@ -312,7 +312,7 @@ namespace quda {
static constexpr int N = nColor * nSpin / 2;
reconstruct_t<Float, N * N, clover::reconstruct()> recon;
FloatNAccessor(const CloverField &A, bool inverse = false) :
-a(static_cast<Float *>(const_cast<void *>(A.V(inverse)))),
+a(A.Bytes() ? A.data<Float *>(inverse) : nullptr),
stride(A.VolumeCB()),
offset_cb(A.Bytes() / (2 * sizeof(Float))),
compressed_block_size(A.compressed_block_size()),
@@ -403,7 +403,7 @@ namespace quda {
const int N = nSpin * nColor / 2;
const complex<Float> zero;
Accessor(const CloverField &A, bool inverse = false) :
-a(static_cast<Float *>(const_cast<void *>(A.V(inverse)))),
+a(A.Bytes() ? A.data<Float *>(inverse) : nullptr),
offset_cb(A.Bytes() / (2 * sizeof(Float))),
zero(complex<Float>(0.0, 0.0))
{
@@ -639,7 +639,7 @@ namespace quda {
if (clover.max_element(is_inverse) == 0.0 && isFixed<Float>::value)
errorQuda("%p max_element(%d) appears unset", &clover, is_inverse);
if (clover.Diagonal() == 0.0 && clover.Reconstruct()) errorQuda("%p diagonal appears unset", &clover);
-this->clover = clover_ ? clover_ : (Float *)(clover.V(is_inverse));
+this->clover = clover_ ? clover_ : clover.data<Float *>(is_inverse);
}

QudaTwistFlavorType TwistFlavor() const { return twist_flavor; }
@@ -844,7 +844,7 @@ namespace quda {
if (clover.Order() != QUDA_PACKED_CLOVER_ORDER) {
errorQuda("Invalid clover order %d for this accessor", clover.Order());
}
-this->clover = clover_ ? clover_ : (Float *)(clover.V(inverse));
+this->clover = clover_ ? clover_ : clover.data<Float *>(inverse);
}

QudaTwistFlavorType TwistFlavor() const { return twist_flavor; }
@@ -892,8 +892,8 @@ namespace quda {
if (clover.Order() != QUDA_QDPJIT_CLOVER_ORDER) {
errorQuda("Invalid clover order %d for this accessor", clover.Order());
}
-offdiag = clover_ ? ((Float **)clover_)[0] : ((Float **)clover.V(inverse))[0];
-diag = clover_ ? ((Float **)clover_)[1] : ((Float **)clover.V(inverse))[1];
+offdiag = clover_ ? ((Float **)clover_)[0] : clover.data<Float **>(inverse)[0];
+diag = clover_ ? ((Float **)clover_)[1] : clover.data<Float **>(inverse)[1];
}

QudaTwistFlavorType TwistFlavor() const { return twist_flavor; }
@@ -970,7 +970,7 @@ namespace quda {
if (clover.Order() != QUDA_BQCD_CLOVER_ORDER) {
errorQuda("Invalid clover order %d for this accessor", clover.Order());
}
-this->clover[0] = clover_ ? clover_ : (Float *)(clover.V(inverse));
+this->clover[0] = clover_ ? clover_ : clover.data<Float *>(inverse);
this->clover[1] = (Float *)((char *)this->clover[0] + clover.Bytes() / 2);
}

49 changes: 8 additions & 41 deletions include/color_spinor_field.h
@@ -121,18 +121,13 @@ namespace quda
}
};

-class ColorSpinorParam : public LatticeFieldParam
-{
-
-public:
+struct ColorSpinorParam : public LatticeFieldParam {
int nColor = 0; // Number of colors of the field
int nSpin = 0; // =1 for staggered, =2 for coarse Dslash, =4 for 4d spinor
int nVec = 1; // number of packed vectors (for multigrid transfer operator)

QudaTwistFlavorType twistFlavor = QUDA_TWIST_INVALID; // used by twisted mass

QudaSiteOrder siteOrder = QUDA_INVALID_SITE_ORDER; // defined for full fields

QudaFieldOrder fieldOrder = QUDA_INVALID_FIELD_ORDER; // Float, Float2, Float4 etc.
QudaGammaBasis gammaBasis = QUDA_INVALID_GAMMA_BASIS;
QudaFieldCreate create = QUDA_INVALID_FIELD_CREATE;
@@ -179,7 +174,6 @@ namespace quda
ColorSpinorParam() = default;

// used to create cpu params

ColorSpinorParam(void *V, QudaInvertParam &inv_param, const lat_dim_t &X, const bool pc_solution,
QudaFieldLocation location = QUDA_CPU_FIELD_LOCATION) :
LatticeFieldParam(4, X, 0, location, inv_param.cpu_prec),
@@ -188,20 +182,12 @@ namespace quda
|| inv_param.dslash_type == QUDA_LAPLACE_DSLASH) ?
1 :
4),
-nVec(1),
twistFlavor(inv_param.twist_flavor),
-siteOrder(QUDA_INVALID_SITE_ORDER),
-fieldOrder(QUDA_INVALID_FIELD_ORDER),
gammaBasis(inv_param.gamma_basis),
create(QUDA_REFERENCE_FIELD_CREATE),
pc_type(inv_param.dslash_type == QUDA_DOMAIN_WALL_DSLASH ? QUDA_5D_PC : QUDA_4D_PC),
-v(V),
-is_composite(false),
-composite_dim(0),
-is_component(false),
-component_id(0)
+v(V)
{

if (nDim > QUDA_MAX_DIM) errorQuda("Number of dimensions too great");
for (int d = 0; d < nDim; d++) x[d] = X[d];

@@ -343,8 +329,7 @@ namespace quda

size_t length = 0; // length including pads, but not norm zone

-void *v = nullptr; // the field elements
-void *v_h = nullptr; // the field elements
+quda_ptr v = {}; // the field elements
size_t norm_offset = 0; /** offset to the norm (if applicable) */

// multi-GPU parameters
@@ -477,37 +462,19 @@ namespace quda
/**
@brief Return pointer to the field allocation
*/
-void *V()
-{
-if (ghost_only) errorQuda("Not defined for ghost-only field");
-return v;
-}
-
-/**
-@brief Return pointer to the field allocation
-*/
-const void *V() const
-{
-if (ghost_only) errorQuda("Not defined for ghost-only field");
-return v;
-}
-
-/**
-@brief Return pointer to the norm base pointer in the field allocation
-*/
-void *Norm()
+template <typename T = void *> auto data() const
{
if (ghost_only) errorQuda("Not defined for ghost-only field");
-return static_cast<char *>(v) + norm_offset;
+return reinterpret_cast<T>(v.data());
}

/**
@brief Return pointer to the norm base pointer in the field allocation
*/
-const void *Norm() const
+void *Norm() const
{
if (ghost_only) errorQuda("Not defined for ghost-only field");
-return static_cast<char *>(v) + norm_offset;
+return static_cast<char *>(v.data()) + norm_offset;
}

size_t NormOffset() const { return norm_offset; }
@@ -938,7 +905,7 @@ namespace quda
static void test_compatible_weak(const ColorSpinorField &a, const ColorSpinorField &b);

friend std::ostream &operator<<(std::ostream &out, const ColorSpinorField &);
-friend class ColorSpinorParam;
+friend struct ColorSpinorParam;
};

/**
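
The `ColorSpinorParam` constructor in this file drops initializers such as `nVec(1)`, `siteOrder(QUDA_INVALID_SITE_ORDER)` and `fieldOrder(QUDA_INVALID_FIELD_ORDER)`, which repeat the default member initializers visible in the first hunk. The following sketch shows why that is safe in C++: a member omitted from a constructor's initializer list takes its default member initializer. `param_sketch` and `order_t` are hypothetical names invented for this example and do not mirror the full QUDA struct.

```cpp
#include <cassert>

// Hypothetical enum standing in for QUDA's order enums (assumption).
enum class order_t { invalid, float2, float4 };

// Hypothetical parameter struct: every member carries a default initializer.
struct param_sketch {
  int nColor = 0;
  int nSpin = 0;
  int nVec = 1; // used whenever a constructor does not mention nVec
  order_t fieldOrder = order_t::invalid;
  void *v = nullptr;

  param_sketch() = default;

  // Only members that differ from their defaults appear in the list;
  // nVec and fieldOrder fall back to the initializers declared above.
  param_sketch(void *V, int nColor_, bool staggered) : nColor(nColor_), nSpin(staggered ? 1 : 4), v(V) { }
};
```

Keeping the defaults in one place means a later change to a default value cannot silently disagree with a constructor that restated the old one.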
17 changes: 8 additions & 9 deletions include/color_spinor_field_order.h
@@ -877,14 +877,13 @@ namespace quda
FieldOrderCB(const ColorSpinorField &field, int nFace = 1, void *const v_ = 0, void *const *ghost_ = 0) :
GhostOrder(field, nFace, ghost_), volumeCB(field.VolumeCB()), accessor(field)
{
-v.v = v_ ? static_cast<complex<storeFloat> *>(const_cast<void *>(v_)) :
-static_cast<complex<storeFloat> *>(const_cast<void *>(field.V()));
+v.v = v_ ? static_cast<complex<storeFloat> *>(const_cast<void *>(v_)) : field.data<complex<storeFloat> *>();
resetScale(field.Scale());

if constexpr (fixed && block_float) {
if constexpr (nColor == 3 && nSpin == 1 && nVec == 1 && order == 2)
// special case where the norm is packed into the per site struct
-v.norm = reinterpret_cast<norm_t *>(const_cast<void *>(field.V()));
+v.norm = field.data<norm_t *>();
else
v.norm = static_cast<norm_t *>(const_cast<void *>(field.Norm()));
v.norm_offset = field.Bytes() / (2 * sizeof(norm_t));
@@ -1088,7 +1087,7 @@ namespace quda
size_t bytes;

FloatNOrder(const ColorSpinorField &a, int nFace = 1, Float *buffer = 0, Float **ghost_ = 0) :
-field(buffer ? buffer : (Float *)a.V()),
+field(buffer ? buffer : a.data<Float *>()),
norm(buffer ? reinterpret_cast<norm_type *>(reinterpret_cast<char *>(buffer) + a.NormOffset()) :
const_cast<norm_type *>(reinterpret_cast<const norm_type *>(a.Norm()))),
offset(a.Bytes() / (2 * sizeof(Float) * N)),
@@ -1316,7 +1315,7 @@ namespace quda
size_t bytes;

FloatNOrder(const ColorSpinorField &a, int nFace = 1, Float *buffer = 0, Float **ghost_ = 0) :
-field(buffer ? buffer : (Float *)a.V()),
+field(buffer ? buffer : a.data<Float *>()),
offset(a.Bytes() / (2 * sizeof(Vector))),
volumeCB(a.VolumeCB()),
nParity(a.SiteSubset()),
@@ -1505,7 +1504,7 @@ namespace quda
int faceVolumeCB[4];
int nParity;
SpaceColorSpinorOrder(const ColorSpinorField &a, int nFace = 1, Float *field_ = 0, float * = 0, Float **ghost_ = 0) :
-field(field_ ? field_ : (Float *)a.V()),
+field(field_ ? field_ : a.data<Float *>()),
offset(a.Bytes() / (2 * sizeof(Float))),
volumeCB(a.VolumeCB()),
nParity(a.SiteSubset())
@@ -1589,7 +1588,7 @@ namespace quda
int faceVolumeCB[4];
int nParity;
SpaceSpinorColorOrder(const ColorSpinorField &a, int nFace = 1, Float *field_ = 0, float * = 0, Float **ghost_ = 0) :
-field(field_ ? field_ : (Float *)a.V()),
+field(field_ ? field_ : a.data<Float *>()),
offset(a.Bytes() / (2 * sizeof(Float))),
volumeCB(a.VolumeCB()),
nParity(a.SiteSubset())
@@ -1668,7 +1667,7 @@ namespace quda
int exDim[4]; // full field dimensions
PaddedSpaceSpinorColorOrder(const ColorSpinorField &a, int nFace = 1, Float *field_ = 0, float * = 0,
Float **ghost_ = 0) :
-field(field_ ? field_ : (Float *)a.V()),
+field(field_ ? field_ : a.data<Float *>()),
volumeCB(a.VolumeCB()),
exVolumeCB(1),
nParity(a.SiteSubset()),
@@ -1763,7 +1762,7 @@ namespace quda
int volumeCB;
int nParity;
QDPJITDiracOrder(const ColorSpinorField &a, int = 1, Float *field_ = 0, float * = 0) :
-field(field_ ? field_ : (Float *)a.V()), volumeCB(a.VolumeCB()), nParity(a.SiteSubset())
+field(field_ ? field_ : a.data<Float *>()), volumeCB(a.VolumeCB()), nParity(a.SiteSubset())
{
}
