Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use size_t in mr2d_malloc instead of index-bound Int #85

Merged
merged 4 commits into from
Aug 6, 2024

Conversation

gdonval
Copy link
Contributor

@gdonval gdonval commented Oct 3, 2023

In C, the type of the number of bytes passed to malloc is of type size_t.

The current code ties that number of bytes to the general "Index type" (Int) which is defined at compile time and does fail on 32-bits systems setting Int as 64-bits if the number exceeds the 32-bit range.

Additionally, this artificially limits 64-bit-element aux matrices to 2GB instead of 16GB with 32-bits signed indices. This is a limit I have reached.

Changing the type back to size_t fixes all of that.

In C, the type of the number of bytes passed to `malloc` *is* of type `size_t`.

The current code ties that number of bytes to the general "Index type" (`Int`) which is defined at compile time and does fail on 32-bits systems setting `Int` as 64-bits.

Additionally, this artificially limits 64-bit-element aux matrices to 2GB instead of 16GB with 32-bits signed indices.
@gdonval
Copy link
Contributor Author

gdonval commented Oct 3, 2023

At least at home, all tests besides the two listed in issue #69 pass (make test). I hope these tests cover what is in REDIST.

@gdonval gdonval marked this pull request as draft October 3, 2023 16:51
@gdonval
Copy link
Contributor Author

gdonval commented Oct 3, 2023

My NEGFLAG is wrong. I'll fix that tomorrow. Is there a special command to use to test the REDIST part of the xode or is it just not tested?

@gdonval gdonval marked this pull request as ready for review October 4, 2023 10:56
@gdonval
Copy link
Contributor Author

gdonval commented Oct 4, 2023

@julielangou Can I kindly ask for you opinion on this as a maintainer?

The gist of it is that we have this signature mr2d_malloc(Int n), bounding the internal malloc to Int, which is the indexing type.

This PR operates this transformation: mr2d_malloc(Int n) -> mr2d_malloc(size_t n), yet ensures Int -> size_t conversion does not involve Int-negative values and ensures that no 64-bit values are passed to malloc on 32-bit systems.

The main advantage of this on 64-bit system is allowing the use of the full signed 32-bit indexing range instead of range / element size. I.e. unless have made a mistake the max AUX matrix size is now 16GB instead of 2GB previously.

gdonval added a commit to gdonval/scalapack-feedstock that referenced this pull request Oct 5, 2023
The gist of it is that we have this signature `mr2d_malloc(Int n)`,
bounding the internal `malloc` to `Int`, which is Scalapack's indexing type.
This is not how `malloc` is defined and means that maximum AUX matrix
size is artifically limited on 64-bit systems to 2GB.

This PR operates this transformation: `mr2d_malloc(Int n) -> mr2d_malloc(size_t n)`,
yet ensures `Int -> size_t` conversion does not involve Int-negative values
and ensures that no 64-bit values are passed to `malloc` on 32-bit systems.

The main advantage of this on 64-bit system allows the use of
the **full** _signed_ 32-bit indexing range instead of `range / element size`.
E.g. the max AUX matrix size is now 16GB instead of 2GB previously.

With this, "standard" 32-bit Scalapack and Blas/Lapack can still be used
in programs like GPAW.

Details (and full commit history) are in
Reference-ScaLAPACK/scalapack#85
which does not seem to receive much attention.

This patch is a way to provide that feature to conda users in the meantime.
@bE554357
Copy link

bE554357 commented Aug 6, 2024

Hello, @gdonval, @julielangou and @langou. I am asking you about status of this MR? Do one need to make fixes to it's code or can it be merged as is?

Pgemr2d is important function for some applications, which rely of scalapack SVD decomposition and some other functions. Current implementation does not allow to solve problems with large input sizes with optimal MPI configurations.

@gdonval
Copy link
Contributor Author

gdonval commented Aug 6, 2024

If tests still pass, this should be fine.

@langou langou merged commit 25935e1 into Reference-ScaLAPACK:master Aug 6, 2024
minrk pushed a commit to gdonval/scalapack-feedstock that referenced this pull request Nov 1, 2024
The gist of it is that we have this signature `mr2d_malloc(Int n)`,
bounding the internal `malloc` to `Int`, which is Scalapack's indexing type.
This is not how `malloc` is defined and means that maximum AUX matrix
size is artifically limited on 64-bit systems to 2GB.

This PR operates this transformation: `mr2d_malloc(Int n) -> mr2d_malloc(size_t n)`,
yet ensures `Int -> size_t` conversion does not involve Int-negative values
and ensures that no 64-bit values are passed to `malloc` on 32-bit systems.

The main advantage of this on 64-bit system allows the use of
the **full** _signed_ 32-bit indexing range instead of `range / element size`.
E.g. the max AUX matrix size is now 16GB instead of 2GB previously.

With this, "standard" 32-bit Scalapack and Blas/Lapack can still be used
in programs like GPAW.

Details (and full commit history) are in
Reference-ScaLAPACK/scalapack#85
which does not seem to receive much attention.

This patch is a way to provide that feature to conda users in the meantime.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants