Snapstart before checkPoint not getting called #184
Comments
Thanks for reporting this. Lambda Web Adapter has not implemented SnapStart runtime hooks yet. The adapter and the Spring Boot application are two separate processes, so the in-process runtime hooks design won't work in this case. We are thinking about adding two HTTP calls to trigger runtime hooks in the web application process. For example, before the snapshot is taken, the adapter would send a POST request to the /checkpoint path on the web app, and after resume it would send another POST request to the /resume path. These two requests would act as the runtime hooks for the web application. You could change the actual paths via configuration. Do you think this makes sense?
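To make the proposal above concrete, here is a minimal sketch of what the web application side could look like if the adapter sent POST /checkpoint and POST /resume. It uses only the JDK's built-in HTTP server. `SnapshotHook` and `HookEndpoint` are hypothetical names invented for this illustration; `SnapshotHook` is a stand-in, not the real CRaC interface, and none of this is the adapter's actual implementation.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a CRaC-style resource; NOT the real org.crac API. */
interface SnapshotHook {
    void beforeCheckpoint() throws Exception;
    void afterRestore() throws Exception;
}

/** Maps the proposed POST /checkpoint and POST /resume requests onto registered hooks. */
final class HookEndpoint {
    private final List<SnapshotHook> hooks = new ArrayList<>();

    void register(SnapshotHook hook) {
        hooks.add(hook);
    }

    /** Runs the matching hooks and returns the HTTP status code to send back. */
    int dispatch(String method, String path) {
        if (!"POST".equals(method)) {
            return 405; // only POST is part of the proposal
        }
        try {
            if ("/checkpoint".equals(path)) {
                for (SnapshotHook hook : hooks) {
                    hook.beforeCheckpoint();
                }
                return 204;
            }
            if ("/resume".equals(path)) {
                for (SnapshotHook hook : hooks) {
                    hook.afterRestore();
                }
                return 204;
            }
            return 404;
        } catch (Exception e) {
            return 500; // a failing hook is reported to the caller
        }
    }

    /** Serves the hook paths on the loopback interface only. */
    HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        server.createContext("/", exchange -> {
            int status = dispatch(exchange.getRequestMethod(),
                                  exchange.getRequestURI().getPath());
            exchange.sendResponseHeaders(status, -1); // -1 means no response body
            exchange.close();
        });
        server.start();
        return server;
    }
}
```

An application would register one hook per resource that needs cleanup (for example, a hook that evicts pooled database connections) and call `start` during startup. Keeping `dispatch` separate from the server wiring means the same logic could sit behind a framework controller instead.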
@bnusunny First of all thanks for this excellent tooling to deploy frameworks to lambda. Having said that I'm a bit confused why the |
Your assumptions are correct. That's what I'm proposing. The SnapStart runtime hooks are implemented by the Java runtime and exposed to developers as the CRaC API. This works for normal Java functions because the function code runs inside the Java runtime process. LWA is an extension and also a custom runtime process. LWA could receive signals for runtime hooks, but since the web application is not running within the LWA process, it is not possible to trigger the CRaC API in another process without sending a request to it. I could also provide a Java package that exposes the two APIs and triggers the CRaC hooks for you. All you would need to do is include this package. Would this help?
I could expose these two APIs over a Unix domain socket to make the IPC more secure.
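As a rough illustration of the Unix domain socket idea, the sketch below passes a single hook command over a socket file using the JDK's own Unix domain socket support (Java 16 or later). The class name `HookSocket`, the plain-text command format, and the socket file location are all assumptions made for this example; the thread does not specify a wire protocol.

```java
import java.io.IOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: sends one hook command over a Unix domain socket. */
final class HookSocket {

    /** Application side: binds a listening channel on the given socket file. */
    static ServerSocketChannel listen(Path socketPath) throws IOException {
        Files.deleteIfExists(socketPath); // remove a stale socket file, if any
        ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX);
        server.bind(UnixDomainSocketAddress.of(socketPath));
        return server;
    }

    /** Caller side: connects, writes one command, then closes the connection. */
    static void send(Path socketPath, String command) throws IOException {
        try (SocketChannel channel = SocketChannel.open(UnixDomainSocketAddress.of(socketPath))) {
            ByteBuffer buffer = ByteBuffer.wrap(command.getBytes(StandardCharsets.UTF_8));
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
    }

    /** Application side: accepts one connection and reads until the peer closes. */
    static String receive(ServerSocketChannel server) throws IOException {
        try (SocketChannel channel = server.accept()) {
            ByteBuffer buffer = ByteBuffer.allocate(64); // commands are assumed to be short
            while (buffer.hasRemaining() && channel.read(buffer) != -1) {
                // keep reading until end of stream or the buffer is full
            }
            buffer.flip();
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }
}
```

Because the socket is a file, access to it can be limited with ordinary file permissions, and nothing is reachable over the network. That is the property that makes this option attractive compared with an HTTP path.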
A Java package will definitely help from a developer's viewpoint. But won't it increase the scope of what you have to handle? You would have to plan implementations for almost all of the network-bound solutions supported by frameworks. My understanding of an extension was that it can have a separate runtime, but the runtime on which my code runs is still the AWS-managed Java runtime (the console also shows it that way). In that case, can't the extension delegate the task of checkpointing to the Java runtime on which my code runs? I have to admit that my understanding of extensions is very limited and what I say may be completely wrong. It would be great if you could point me to a good write-up on extensions and also help me understand why adding this extension altered the behaviour of the Java runtime.
HTTP requests can expose CRaC to the outside world. If you are taking the IPC path, then a Unix domain socket is better than HTTP. I'm not sure whether there is an even better way of resolving this, as I have a limited understanding of the overall setup.
The Lambda Web Adapter layer contains two files: one is To read more about the wrapper script, check out the Lambda Developer Guide here.
I think having the web adapter call /checkpoint and /restore would be a good compromise. It would make sense to have a library that takes the incoming /checkpoint and /restore requests and turns them into the standard CRaC API calls. That way, apps that use the CRaC API can continue to function. This could be done in two phases: early adopters can listen on /checkpoint and /restore while we wait for the frameworks to potentially add support for bridging between the URIs and the CRaC APIs. If you do go down the path of posting to /checkpoint and /restore, then I think we would need a way for the adapter to block those requests from coming in from outside. I wouldn't want an external entity to be able to close my DB connections over and over again.
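The blocking concern in the comment above comes down to a path check applied to incoming traffic before it is forwarded to the application. The following is only a sketch of that check, written in Java for consistency with the rest of the thread. `HookPathGuard` is a hypothetical name, and where such a check would live (in the adapter or in a filter in front of the app) is not settled here.

```java
/** Hypothetical guard: refuses to forward external requests that target the hook paths. */
final class HookPathGuard {
    private final String checkpointPath;
    private final String restorePath;

    HookPathGuard(String checkpointPath, String restorePath) {
        this.checkpointPath = normalize(checkpointPath);
        this.restorePath = normalize(restorePath);
    }

    /** Returns false when an externally received request path hits a reserved hook path. */
    boolean shouldForward(String requestPath) {
        String path = normalize(requestPath);
        return !path.equals(checkpointPath) && !path.equals(restorePath);
    }

    /** Drops the query string and trailing slashes so simple variants cannot slip through. */
    private static String normalize(String path) {
        int query = path.indexOf('?');
        String result = query >= 0 ? path.substring(0, query) : path;
        while (result.length() > 1 && result.endsWith("/")) {
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }
}
```

With `new HookPathGuard("/checkpoint", "/restore")`, a request for `/orders` would be forwarded while `/checkpoint`, `/restore/` and `/checkpoint?x=1` would be rejected. A real implementation would also need to consider percent-encoded paths, which this sketch does not handle.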
We're making progress on this issue and have finalized a plan to implement runtime hooks. Here's an overview of the planned solution:
Our team is actively working on the implementation of these solutions. We will continue to provide updates on this issue as we progress.
Thanks for the update. Could you also clarify this? You mentioned:
But the lambda extension documentation says that internal extensions ( |
The bootstrap script uses exec -- "${LAMBDA_TASK_ROOT}/${_HANDLER}"
In that case a |
I have a Spring Boot application (v2.7.2) that connects to a database using a Hikari connection pool. When deployed with aws-serverless-java-container and SnapStart, the application's beforeCheckpoint function is correctly called to evict connections from the pool. But the same does not happen when it is deployed with the web adapter as shown in the springboot-zip example. It looks like there is some issue with SnapStart when used with the adapter, but I couldn't find it documented anywhere. I'm pretty sure that my CRaC configuration for checkpointing is right because it works with aws-serverless-java-container.