Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dyno: Refactor uAST generation #26145

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

benharsh
Copy link
Member

@benharsh benharsh commented Oct 25, 2024

Previously, generated uAST was being created given a particular type
possibly with other information about how the call was being invoked.
Generated uAST was then created as a fake 'included' module with a
parent symbol path was that of the module in which the type was defined.
This uAST was then stored and retrieved using some set/get queries in
chpl::parsing.

This PR updates the implementation to instead rely on generated
functions using the type's symbol path as its parent path. This allows
parseFileContainingIdToBuilderResult to use a generated ID's symbol
path and symbol name to query the uAST, as opposed to using the old
getter method. Now, we rely on the buildDefaultFunction query in
chpl::resolution.

For example, a default initializer will have a path like
MyMod.MyType.init. With the parent symbol path of MyMod.MyType
we can query the type's uAST if needed, and with the name init we
know which uAST-generating function to invoke (e.g. buildInitializer).

This change requires that Builders of generated code be able to support a
non-module as a top-level expression. This is achieved by disabling implicit
module creation for generated uAST. Currently it is expected that this
top-level expression will be a Function.

The recently-added compiler-generated assignment operators do not fit in
well with this new representation because the primitive types do not have an
ID associated with them. For now, it is simpler to remove those generated
operators and go back to using manually-created operators in testing. Future
work is expected to eliminate the need for such operators by using either the
standard or minimal modules.

Cleanup:

  • get rid of file paths for generated uAST
  • tidy up creation of generated IDs, moving post-order-id logic into ID.cpp
  • Simplify usage of "generatedFrom" ID in Builder

Testing:

  • test-dyno
  • full local

These do not fit well with the new way of representing generated uAST in
the frontend.

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
Previously, generated uAST was being created given a particular type
possibly with other information about how the call was being invoked.
Generated uAST was then created as a fake 'included' module with a
parent symbol path was that of the module in which the type was defined.
This uAST was then stored and retrieved using some set/get queries in
``chpl::parsing``.

This commit updates the implementation to instead rely on generated
functions using the type's symbol path as its parent path. This allows
``parseFileContainingIdToBuilderResult`` to use a generated ID's symbol
path and symbol name to query the uAST, as opposed to using the old
getter method. Now, we rely on the ``buildDefaultFunction`` query in
``chpl::resolution``.

For example, a default initializer will have a path like
``MyMod.MyType.init``. With the parent symbol path of ``MyMod.MyType``
we can query the type's uAST if needed, and with the name ``init`` we
know which uAST-generating function to invoke. This allows us to get rid
of the clunky 'chpl__generated' prefix to generated modules. We also now
need to avoid generating an implicit module for generated uAST and
instead have a Function as the top-level expression in a Builder.

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
…stem

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
This commit updates buildTypeConstructor to build a type constructor
given only a type's ID.

Now that the untyped signature is not pointing to non-existent uAST, we
can also stop using the original type's ID in the untyped signature.
This requires some minor modifications to return-type-inference now that
'idIsFunction' is returning true for type constructors.

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
Parent symbol IDs for generated functions now refer to actual
non-generated uAST, meaning that the returned ID should not itself be
considered 'generated'.

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
This commit updates Builder to store a boolean indicating whether the
uAST is generated rather than storing the old 'generatedFrom' ID. The
necessary information from this ID can and should be stored in the
pre-existing ``startingSymbolPath_`` field.

After some experimentation it turns out that BuilderResult no longer
needs a ``generatedFrom`` ID to use for the id-to-parent map. This is
because we are now correctly setting the parent path for generated IDs
in ``parseFileContainingIdToBuilderResult``.

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
@@ -142,7 +156,8 @@ ID ID::parentSymbolId(Context* context) const {
}

if (this->isFabricatedId() &&
this->fabricatedIdKind() == FabricatedIdKind::Generated) {
this->fabricatedIdKind() == FabricatedIdKind::Generated &&
!isSymbolDefiningScope()) {
Copy link
Member

@mppf mppf Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this assumes that it's not possible to have a generated ID within another generated ID. That's probably fine, but needs to be documented as a constraint somewhere. Potentially as a CHPL_ASSERT somewhere. Or as a comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the assumption is that generated uAST will not contain nested scope-defining symbols. I've pushed a commit containing both comments and an assertion to enforce this.

if (postOrderId == -1) {
pid = ID_GEN_START;
} else {
pid = ID_GEN_START - 1 - postOrderId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a check that we're packing and unpacking these correctly in to testIds.cpp ? by calling generatedId and checking that postOrderId comes out right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great suggestion! I had introduced a bug in a previous PR that is now fixed.

/**
Given a type's symbol path and the name of a default function, return a
BuilderResult storing the generated default function uAST.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is interesting. At first I would think that this function should return a Function* ? I see that it returns a BuilderResult* to integrate into parseFileContainingIdToBuilderResult. IMO it might be worth renaming the function to emphasize this -- but at least it would be good to put a comment explaining that this is intended to be called by parsing queries that need to handle a BuilderResult* for generality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to builderResultForDefaultFunction.

const BuilderResult*
buildDefaultFunction(Context* context,
UniqueString typePath,
UniqueString name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this function and all of the functions implementing it should be stored in default-functions.cpp. I know that is slightly out of pattern (since the prototype is in resolution-queries.h) but that can be addressed with a comment here or by moving these to a new public header (so that we have a public chpl/include/default-functions.h and a private one in lib/resolution/ with a different name).

Alternatively, you can keep buildDefaultFunction but move the various functions implementing it to default-functions.cpp.

Either way, I think it's important to move the helper functions implementing buildDefaultFunction to default-functions.cpp both for find-ability and because resolution-queries.cpp can get overly large because of it's grab-bag nature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved builderResultForDefaultFunction and buildTypeConstructor into default-functions.cpp.

Firstly, fixes a bug in parentSymbolId where getting the parentSymbolId
of a non-scope-defining symbol (e.g. an Identifier) would incorrectly
return a generated ID with the grandparent symbol path. This does not
seem to be exposed in any testing at this time.

Adds some comments explaining the assumption that there is only one
scope-defining symbol in generated uAST. Also adds an assertion to
enforce this rule, which was convenient to add in doAssignIds.

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
…arity

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
…ult-functions.cpp

Signed-off-by: Ben Harshbarger <ben.harshb@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants