Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add updateConfiguration function to jsg Isolates #3065

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

danlapid
Copy link
Contributor

@danlapid danlapid commented Nov 6, 2024

This can be used to update the given configuration at runtime.
Note that while some jsg structs are lazily using the configuration, others can use it at construction and will have the original configuration value.

This is another step towards Python Isolate Pools and is coming in direct continuation to #2999.
This is another part split from workerd #2936 and internal #9006 for reviewability.

This can be used to update the given configuration at runtime. Note that
while some jsg structs are lazily using the configuration, others can
use it at construction and will have the original configuration value.
@danlapid danlapid force-pushed the dlapid/update_jsg_configuration branch from 3a697f2 to 23172a2 Compare November 7, 2024 11:43
@danlapid
Copy link
Contributor Author

danlapid commented Nov 8, 2024

ping @jasnell or anyone else who feels like reviewing this 😄

jasnell
jasnell previously approved these changes Nov 9, 2024
Copy link
Member

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but I'd like @kentonv to sign off as well in case there's some edge case I'm not thinking of.

@danlapid
Copy link
Contributor Author

danlapid commented Nov 9, 2024

@kentonv you've been pretty involved in the design phase but I don't think you've seen the PRs so far.
Linked above is a previous PR that disabled memoization for the v8 templates created on the first context (the one where emscripten and pyodide are instantiated).
That is the only place I found where previous version of the JSG configuration is used before I set the request specific feature flags.
After receiving the request I use this function to update the configuration so that any subsequent calls to various functions use the new configuration.
The way I tested this was I changed the internal repo so that every jsg isolated is created with an empty configuration and then I called this function to update the various places where the configuration is saved then the let the worker isolate instantiation continue and ran all tests with this configuration.
In addition a second round of tests where I instantiated emscripten and pyodide in a separate context instantiated before any feature flags were set and then continued to instantiate user code in the main context created with the request specific user flags and all made sure all Python tests pass.

@kentonv
Copy link
Member

kentonv commented Nov 14, 2024

(Sorry for delay, it turns out my mail filters weren't properly labeling threads were I was mentioned.)

Oof, this approach seems quite precarious.

I take it the zygote really does depend on using ServiceWorkerGlobalScope as the context type? It can't run in a more generic isolate?

I believe that skipping memoization will break type checking. That is, if an API function takes jsg::Ref<SomeType> as a parameter, you won't be able to invoke said function, because even if you pass it a valid instance of SameType, the type check will fail, because it won't have a memoized template to check it against.

Did you try the approach of creating a separate TypeWrapper instance for the zygote? That seems less hacky especially if it means you can defer creation of the final TypeWraper until you know the config, and thus never need to swap out the config on an existing object...

@danlapid
Copy link
Contributor Author

(Sorry for delay, it turns out my mail filters weren't properly labeling threads were I was mentioned.)

Oof, this approach seems quite precarious.

I take it the zygote really does depend on using ServiceWorkerGlobalScope as the context type? It can't run in a more generic isolate?

I believe that skipping memoization will break type checking. That is, if an API function takes jsg::Ref<SomeType> as a parameter, you won't be able to invoke said function, because even if you pass it a valid instance of SameType, the type check will fail, because it won't have a memoized template to check it against.

Did you try the approach of creating a separate TypeWrapper instance for the zygote? That seems less hacky especially if it means you can defer creation of the final TypeWraper until you know the config, and thus never need to swap out the config on an existing object...

We could create another context type, but it would either mean a lot of code duplication (which can be prettified with template params but that's also ugly) or it wouldn't help with memoization.
For example, we need TextEncoder, if we have another global context with TextEncoder as a nested type then it'll be memoized just as well.
We could skip the wrapper all together by just attaching all the APIs to the global context ourselves but that seems rather boilerplatey as well.

I haven't tried using a separate TypeWrapper, it seemed to me like there's a lot more room for error in that approach. A lot of the methods of jsg::Isolate depend on having one wrapper instance, if we add another TypeWrapper we'd probably need to generalize that class so that it uses the correct wrapper per context I guess.

Of course I could be missing an easier avenue to make this work that you're seeing and I'd love to hear more.

What I'm really trying to do here is use as much of jsg as possible and enjoy as much of our existing APIs while not refactoring massive core parts of our codebase that might have side effects for regular isolates.
That being said, I wouldn't want it to be unstable either, I definitely see what you mean regarding type checks in the un-memoized context.

What in your opinion and experience is the best approach here? is there an alternate solution that's clear to you that I have not pointed out?

@kentonv
Copy link
Member

kentonv commented Nov 14, 2024

I haven't spent time investigating solutions so I only have ideas, don't know what will actually pan out.

But not memoizing seems too fragile and inefficient to me. In addition to the type checks failing, you will be constructing new templates for every instance of a type, rather than once per type. Seems like that could cause a lot of unexpected inefficiency if you have, like, a loop calling some API that returns objects somewhere.

To restore memoization, it seems like you need one of:
a) Use a separate TypeWrapper which maintains its own memizations.
b) Create a way to clear the existing memoizations if and when the config is rewritten. (Though this may still be weird as code running in the zygote context may then begin to see inconsistent config... newly-created objects will have the compat flags of the app, whereas old objects will have the original compat flags, and also the old objects will no longer pass type checks when passed as arguments to functions...)

@danlapid danlapid marked this pull request as draft November 14, 2024 20:25
@danlapid
Copy link
Contributor Author

Converting back to draft until I've got a better solution.
Thanks for the review everyone!

@jasnell jasnell dismissed their stale review November 14, 2024 21:59

Dismissing pending updates

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants