Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean shutdown #60

Open
Vynce opened this issue Jan 6, 2020 · 2 comments
Open

Clean shutdown #60

Vynce opened this issue Jan 6, 2020 · 2 comments
Assignees

Comments

@Vynce
Copy link

Vynce commented Jan 6, 2020

The Unifi Controller doesn't appear to get shut down cleanly and I'm concerned about database corruption (like happens on the 1st gen cloudkey).

It looks like docker-compose just sends a SIGTERM followed by a SIGKILL 10 seconds later:

$ cat /var/log/syslog

Jan  5 01:57:15 unifi systemd[1]: Stopping Unifi Controller...
Jan  5 01:57:16 unifi docker-compose[6503]: Stopping unifi ...
Jan  5 01:57:26 unifi dockerd[393]: time="2020-01-05T01:57:26.250480054Z" level=info msg="Container 33380a717b98da76c73954c9022a635b6a8897bfec930b6c31c976875b25a402 failed to exit within 10 seconds of signal 15 - using the force"
Jan  5 01:57:26 unifi containerd[390]: time="2020-01-05T01:57:26.656711911Z" level=info msg="shim reaped" id=33380a717b98da76c73954c9022a635b6a8897bfec930b6c31c976875b25a402
Jan  5 01:57:26 unifi dockerd[393]: time="2020-01-05T01:57:26.668748233Z" level=info msg="ignoring event" module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"
Jan  5 01:57:26 unifi systemd[732]: var-lib-docker-containers-33380a717b98da76c73954c9022a635b6a8897bfec930b6c31c976875b25a402-mounts-shm.mount: Succeeded.
Jan  5 01:57:26 unifi systemd[1]: var-lib-docker-containers-33380a717b98da76c73954c9022a635b6a8897bfec930b6c31c976875b25a402-mounts-shm.mount: Succeeded.
Jan  5 01:57:26 unifi systemd[732]: var-lib-docker-overlay2-abd455ad565f14aec98c22906042cb0dcbeb5a3e02de6754169849e78c956605-merged.mount: Succeeded.
Jan  5 01:57:26 unifi systemd[1]: var-lib-docker-overlay2-abd455ad565f14aec98c22906042cb0dcbeb5a3e02de6754169849e78c956605-merged.mount: Succeeded.
Jan  5 01:57:26 unifi docker-compose[6267]: #033[36munifi exited with code 137
Jan  5 01:57:27 unifi docker-compose[6503]: #033[1A#033[2K#015Stopping unifi ... #033[32mdone#033[0m#015#033[1B
Jan  5 01:57:27 unifi docker-compose[6267]: #033[0mGracefully stopping... (press Ctrl+C again to force)
Jan  5 01:57:27 unifi systemd[1]: unifi.service: Succeeded.
Jan  5 01:57:27 unifi systemd[1]: Stopped Unifi Controller.

The upstream unifi init script goes through a multi-phase shutdown process, starting with jsvc, then creating /var/run/unifi/server.stop if jsvc fails, then finally killing ace.jar and telling mongodb to shut down. I've attached unifi.init (extracted from the .deb) since I can't find it anywhere else online.

This project does it a slightly different way:
https://github.com/jacobalberty/unifi-docker#multi-process-container
https://github.com/jacobalberty/unifi-docker/blob/master/docker-entrypoint.sh#L9

Everything does exit cleanly if I run:

docker-compose exec unifi /usr/bin/java -Xmx512M -jar /usr/lib/unifi/lib/ace.jar stop
@ryansch ryansch self-assigned this Jan 6, 2020
@ryansch
Copy link
Owner

ryansch commented Jan 6, 2020

That's pretty serious. I'll look into this the next time I have some free time.

We intentionally don't use jsvc as we don't need the things it gives us inside of docker.

@Vynce
Copy link
Author

Vynce commented Jan 20, 2020

This may be less serious than it seemed. Looking at /mnt/cache/appdata/unifi/logs/mongod.log inside the container, mongod does appear to listen for SIGTERM. It might be nice for the whole app to shut down cleanly, but AFAICT the database already does and that's the main thing that matters.

One small remaining concern is if mongod happens to take longer than 10s to shut down in some cases and gets killed instead. The log below is from an empty database that's not really being used. It only took 6ms to shut down. I also looked at another database under light load (2 APs and a couple dozen clients) and it took 200ms to shut down. Seems unlikely to be an issue for home use.

Mon Jan 13 02:12:31.935 [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
Mon Jan 13 02:12:31.935 [signalProcessingThread] now exiting
Mon Jan 13 02:12:31.935 dbexit:
Mon Jan 13 02:12:31.935 [signalProcessingThread] shutdown: going to close listening sockets...
Mon Jan 13 02:12:31.935 [signalProcessingThread] closing listening socket: 7
Mon Jan 13 02:12:31.935 [signalProcessingThread] closing listening socket: 8
Mon Jan 13 02:12:31.935 [signalProcessingThread] removing socket file: /usr/lib/unifi/run/mongodb-27117.sock
Mon Jan 13 02:12:31.935 [signalProcessingThread] shutdown: going to flush diaglog...
Mon Jan 13 02:12:31.935 [signalProcessingThread] shutdown: going to close sockets...
Mon Jan 13 02:12:31.935 [signalProcessingThread] shutdown: waiting for fs preallocator...
Mon Jan 13 02:12:31.935 [signalProcessingThread] shutdown: closing all files...
Mon Jan 13 02:12:31.937 [conn6] end connection 127.0.0.1:36132 (14 connections now open)
Mon Jan 13 02:12:31.937 [conn9] end connection 127.0.0.1:36140 (14 connections now open)
Mon Jan 13 02:12:31.937 [conn7] end connection 127.0.0.1:36134 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn5] end connection 127.0.0.1:36128 (14 connections now open)
Mon Jan 13 02:12:31.937 [conn13] end connection 127.0.0.1:36150 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn4] end connection 127.0.0.1:36126 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn8] end connection 127.0.0.1:36138 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn15] end connection 127.0.0.1:36176 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn10] end connection 127.0.0.1:36142 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn11] end connection 127.0.0.1:36144 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn12] end connection 127.0.0.1:36148 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn2] end connection 127.0.0.1:36122 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn3] end connection 127.0.0.1:36124 (14 connections now open)
Mon Jan 13 02:12:31.938 [conn1] end connection 127.0.0.1:36120 (14 connections now open)
Mon Jan 13 02:12:31.939 [conn14] end connection 127.0.0.1:36152 (14 connections now open)
Mon Jan 13 02:12:31.941 [signalProcessingThread] closeAllFiles() finished
Mon Jan 13 02:12:31.941 [signalProcessingThread] shutdown: removing fs lock...
Mon Jan 13 02:12:31.941 dbexit: really exiting now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants