In a modern world, we expect webapps to always be available. Yet deployments can cause downtime, which is why we're talking about zero-downtime deployments.
Yeah, it's easy to put a proxy in front of your services and let it balance across multiple instances of your service. Now you can take them down one at a time, and deploy your fancy new version.
But how would you handle long-running requests which are directly connected to a specific instance?
This project showcases how to gracefully shut down a Spring Boot app. The code is more or less a plain copy from spring-projects/spring-boot#4657 (comment), with the only difference being that I wanted to use Kotlin instead of standard Java.
I also wrote a more detailed article about our motivation for this example demo app on my blog.
Clone (or download) this repository:
git clone https://github.com/gesellix/graceful-shutdown-spring-boot
cd graceful-shutdown-spring-boot
You'll need a Java Runtime Environment (JRE) to run the example without Docker.
- run the app, e.g. via
./gradlew bootRun
- perform a looooong download:
curl -X GET "http://localhost:8080/endless" > /dev/null
- send SIGTERM to the Spring process (you can find the <pid> in the application logs):
kill <pid>
When using Docker Stack or Docker Services, you'll need to be aware of the underlying mechanics of a rolling update. You can find a detailed discussion about that topic at github.com/moby/moby/issues/30321.
In a Spring Boot world, you'll need to tweak the following settings:
- Your Spring Boot webapp needs to register a shutdown hook to pause/prevent the JVM shutdown. The example application shows what an implementation looks like in an Apache Tomcat environment. You can tweak the shutdown timeout with the example application property catalina.threadpool.execution.timeout.seconds.
- During a service update, Docker will send the TERM signal to your container and wait for 10 seconds before sending the KILL signal to finally stop your container. If you want your downloads to keep running longer than 10 seconds, you'll need to configure the stop-grace-period to match your needs. Since the Spring Boot app defaults to waiting 30s, I chose 60 seconds as services.app.stop_grace_period in the stack.yml.
- Your replicated service tasks shouldn't be shut down simultaneously. That's why the stack.yml configures services.app.deploy.update_config.parallelism so that tasks are updated sequentially (one at a time). Additionally, I configured services.app.deploy.update_config.delay to be 60s.
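The Docker-related settings from the list above might come together in a stack.yml roughly like the following. This is only a sketch under assumptions: the image name is a placeholder, the file format version is a guess, and the reverse proxy service is left out. The repository's actual stack.yml may differ.

```yaml
version: "3.3"
services:
  app:
    image: example/graceful-shutdown-app:latest  # hypothetical image name
    stop_grace_period: 60s   # Docker waits this long between TERM and KILL
    deploy:
      replicas: 2            # two instances of the example app
      update_config:
        parallelism: 1       # update one task at a time
        delay: 60s           # pause between two task updates
```

The important relation is that stop_grace_period stays larger than the time the app itself waits for pending requests (30s by default here), so Docker does not send KILL while the app is still draining.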
Now the exciting part: you may want to update your app to use a fresh image or add some environment parameter. Such updates trigger a short downtime of your running containers, which effectively stops your running download. Depending on your needs, that may be exactly what you want. But maybe you would like to give running downloads a chance to finish. The grace period needs to be configurable, because only you know how long a typical download should be kept running in your individual service.
Deploy a minimal stack with reverse proxy and two instances of the example app:
docker swarm init # you may add more worker nodes, but that's not necessary for the demo.
docker stack deploy -c stack.yml grace
We can now start the demo scenario by first starting a download and then trying to update the app service.
Open your browser at http://localhost and click "go for it", or use your shell to perform an endless download: curl -X GET "http://localhost/endless" > /dev/null.
You can watch your service logs with docker service logs -f grace_app. The browser will also show you the increasing number of downloaded bytes.
An update with task downtime can easily be triggered:
docker service update --env-add "foo=bar" --detach=false grace_app
Now keep your eye on the browser download "stats" and the service logs, where you can see the Docker service update shutting down your tasks one by one. We also use the detach option --detach=false to watch the service update states in real time from the CLI. One of your tasks will be shut down instantly, because it won't be "locked" by your download. But the shutdown of the other task should be delayed for approximately 30 seconds. Only then will the Spring Boot shutdown continue, and Docker won't have to enforce the shutdown through SIGKILL.
Long story short: you should catch the TERM signal to prevent an instant process shutdown, Docker needs to wait long enough before actually killing your process, and you should configure the parallelism with a value smaller than your number of replicas.
The GracefulShutdown class listens to application events of type ContextClosedEvent. It waits 30 seconds (or whatever you configured) for the Tomcat ThreadPoolExecutor to be shut down. Essentially, it blocks the JVM shutdown to wait for Tomcat to finish pending requests.
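The waiting step can be illustrated with plain JDK classes. The following is a minimal sketch and not the repository's GracefulShutdown class: it uses no Spring or Tomcat types, and the helper name awaitPendingRequests and the thread pool in main are made up for the illustration. It only shows the pattern of refusing new work and then blocking for a bounded time until running tasks are done.

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical helper: stops accepting new tasks, then blocks until all
// running tasks have finished or the timeout has elapsed.
// Returns true if the executor terminated in time, false otherwise.
fun awaitPendingRequests(executor: ExecutorService, timeout: Long, unit: TimeUnit): Boolean {
    executor.shutdown() // submitted tasks keep running, new ones are rejected
    return try {
        executor.awaitTermination(timeout, unit)
    } catch (e: InterruptedException) {
        Thread.currentThread().interrupt() // keep the interrupt flag for the caller
        false
    }
}

fun main() {
    val pool = Executors.newFixedThreadPool(2)
    pool.execute { Thread.sleep(200) } // stands in for a long-running download

    val finished = awaitPendingRequests(pool, 2, TimeUnit.SECONDS)
    println(if (finished) "all pending requests finished" else "timed out, shutting down anyway")
}
```

In the real application the executor would be Tomcat's request thread pool and the timeout would come from the configured property. When the timeout is hit, the shutdown simply continues and open requests are cut off.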
Please note that you won't get any guarantees that the JVM will wait endlessly for your shutdown hook to return. It might interrupt your shutdown hook without further notice. My JVM implementation on macOS and the one in an alpine container seem to be patient enough :)
Please note that there are two additional Spring Boot info contributors, whose results will be visible at /info:
- Docker Info contributor: provides the output of docker info. Expects the Docker engine to be available at /var/run/docker.sock.
- Docker Secrets contributor: dumps the configured secrets and their contents. See the official docs for details. A default example secret is already configured for your convenience.