Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make goals use C++20 coroutines #11005

Merged
merged 18 commits into from
Jul 15, 2024
Merged

Make goals use C++20 coroutines #11005

merged 18 commits into from
Jul 15, 2024

Conversation

L-as
Copy link
Member

@L-as L-as commented Jul 1, 2024

Part of building refactor WG work. I don't necessarily expect to merge the code in the current state, but it's not really a draft since I think it works. There might be bugs hiding somewhere though. OTOH bugs we haven't discovered yet might have been removed by this PR.

Motivation

Goals before this PR do coroutine-like things explicitly
using state and setting function pointers.
Everytime work() is called, the state function pointer is called.
The control flow resulting from this design is not very obvious (for me personally at least).
One consequence of this design is that any state that has to be maintained between
different states has to be in the goal class, such that all variables that could
ever be used are all fields of the class.

It is because of this not clear what fields are used when,
and more importantly, which are initialized in which states.

It is further complicated by the fact that if you do not set the state,
it implicitly loops.

It is not a "nice" way of writing the goal logic.

C++20 however has coroutines, which can do much of what we do manually, automatically.

This PR is an experiment in that direction, seeing how goal logic would look
using coroutines.

Context

Reference for learning about coroutines:

My own as short as possible explanation:
Coroutines are functions that use co_await/co_return.
When you run the coroutine function, a coroutine handle is created,
alongside its user-defined promise type.
There are suspension points at the beginning, at every co_await,
and at the final (possibly implicit) co_return.
Once suspended, you can resume the coroutine by doing coroutine_handle.resume().
Suspension points are implemented by passing a struct to the compiler
that implements await_suspend.
await_suspend can either say "cancel suspension", in which case execution resumes,
"suspend", in which case control is passed back to the caller of coroutine_handle.resume()
or the place where the coroutine function is initially executed in the case of the initial
suspension, or await_suspend can specify another coroutine to jump to, which is
how tail calls are implemented.


I've tried to roughly make every state into a coroutine.
Where before you'd do state = &f; return, you now do co_await SuspendGoal{}; co_return f().
However, you can also do co_await f(), i.e. a "coroutine" call!

Places where a suspension would happen implicitly (i.e. return-ing from the function),
are now explicit by doing co_await SuspendGoal{}.

Each goal at all times keeps track of the top_co, i.e. the coroutine last (possibly now) executed.
When we co_return another coroutine, i.e. do a "coroutine" tail call, we set the top_co,
and switch to it. Our current coroutine is destroyed.

When we co_await another coroutine, we swap top_co, and set the previous top_co,
i.e. the coroutine being currently executed, as the continuation of the coroutine to be called.

Once a coroutine has finished executing (at co_return), we check whether the goal is still busy
(is status ecBusy?), and if so, we jump to the continuation at the final suspension point.
If not, we suspend the coroutine and ignore the continuation, since the goal has finished.
The goal should be destroyed by someone else after this point.

Coroutines are wrapped in Co to ensure automatic destruction.

Currently, there is the limitation that our coroutines (Co) can not return a value,
but this should be trivial to implement using templates, or you could just pass in a pointer
to the coroutine and "return" the value that way.
It has however not been necessary yet.

I've rewritten most of the code mechanically.
DrvOutputSubstitionGoal I've used the most effort on, and it's an example
of how the logic can be heavily simplified.
The intermediate variables no longer have to be fields of the class and aren't in the
header anymore.

Alternatives

You could have heavily simplified the code without coroutines too.
You could have made state into a function pointer with state,
i.e. a class, to improve the legibility of the control flow,
and also prevent implicit looping by making the state return the
next state rather than setting a field that already has a value.

You could also have reimplemented the code in another language entirely.

I deemed both of these suboptimal. We're already using C++, so why not use
all of the features. Coroutines are mostly well supported by all major compilers
AFAICT, albeit GCC didn't support a feature that would reduce complexity by a small
amount (getting the this of coroutine methods in promise_type constructor).

There are obviously better languages than C++, but deciding on a new language
is a social problem as well as a technical one. There also isn't a language
that is better than C++(20) on all fronts (e.g. bootstrappability, multiple implementations,
has a specification).

I feel that we can heavily improve the quality of the Nix codebase without switching from C++.

We are probably better off having multiple implementations of Nix anyway in my own humble opinion.

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@L-as L-as requested a review from Ericson2314 as a code owner July 1, 2024 15:57
@L-as
Copy link
Member Author

L-as commented Jul 1, 2024

@Ericson2314 @rhelmot

@L-as
Copy link
Member Author

L-as commented Jul 1, 2024

I probably need to add more comments. I probably also need to fix the formatting in some places, might have used 2-space indentation by accident in some places.

@L-as L-as requested a review from edolstra as a code owner July 15, 2024 15:19
@github-actions github-actions bot added new-cli Relating to the "nix" command with-tests Issues related to testing. PRs with tests have some priority labels Jul 15, 2024
@L-as L-as marked this pull request as draft July 15, 2024 17:18
@L-as
Copy link
Member Author

L-as commented Jul 15, 2024

Depends on #11073

L-as added 16 commits July 15, 2024 17:54
I dropped the goal of using the implicit HALO optimization,
i.e. allocating the coroutine's frame within the caller's frame.
This isn't really feasible with way C++ coroutines are designed
while also supporting tail-calls.

You could do some trampoline-thing to make it work,
but for the future we can also make the allocation explicitly
in the caller frame by overloading operator new.

I also did some minor refactoring, and the code is simpler now.
GCC seemingly doesn't support promise type constructor,
so we avoid using it.
We were using it for getting access to the Goal,
i.e. the `this` of the coroutine, but we avoid this
by passing it once explicitly when initializing the root
coroutine.
Then, on co_await/co_return, we get the goal from the calling coroutine.

We also add some useful assertion in a bunch of places.
From feedback from @Ericson2314.
Comments from goal.cc have been moved to goal.hh
and made doc comments.

Unnecessary lines of code that don't affect functionality have
also been removed.

Commented out assertion was uncommented-out.
Uses the same style as DrvOutputSubstitutionGoal roughly.
…e instead of Co

Logic for final_awaiter also changed a slight bit to be more correct.
@L-as L-as marked this pull request as ready for review July 15, 2024 18:07
@L-as
Copy link
Member Author

L-as commented Jul 15, 2024

I tested this with memory sanitizer and address sanitizer and don't think I saw any new errors.

It seems to work? I rebased on master and removed the commits from #11073.

Copy link
Member

@Ericson2314 Ericson2314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go! Very excited to usher in the era of this build/* code being actually readable :)

@Ericson2314 Ericson2314 merged commit 846869d into NixOS:master Jul 15, 2024
11 checks passed
edolstra added a commit to DeterminateSystems/nix-src that referenced this pull request Sep 2, 2024
This broke in NixOS#11005. Any number of PathSubstitutionGoals would
be woken up by a single build slot becoming available. If there
are a lot of substitution goals active, this could lead to us
running out of file descriptors (especially on macOS where the
default limit is 256).
mergify bot pushed a commit that referenced this pull request Sep 3, 2024
This broke in #11005. Any number of PathSubstitutionGoals would
be woken up by a single build slot becoming available. If there
are a lot of substitution goals active, this could lead to us
running out of file descriptors (especially on macOS where the
default limit is 256).

(cherry picked from commit a33cb8a)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new-cli Relating to the "nix" command scheduling with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants