feat: Reduce logging level in sapphire-localnet Docker image #513
Force-pushed from 29c114a to b7b5e07.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #513   +/-   ##
=======================================
  Coverage   62.31%   62.31%
=======================================
  Files          38       38
  Lines        3962     3962
=======================================
  Hits         2469     2469
  Misses       1285     1285
  Partials      208      208
docker/common/start.sh (outdated)
@@ -67,7 +67,7 @@ fi
 T_START="$(date +%s)"

 notice "Starting oasis-net-runner with ${CYAN}${PARATIME_NAME}${OFF}...\n"
-/spinup-oasis-stack.sh --log.level info 2>1 &>/var/log/spinup-oasis-stack.log &
+/spinup-oasis-stack.sh --log.level warn 2>1 &>/var/log/spinup-oasis-stack.log &
Can this be made configurable via an env var? Debug logs are really useful when investigating issues.
Good idea, I'll add that :)
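A minimal sketch of what such a switch might look like in a start script. It assumes the variable is called OASIS_NODE_LOG_LEVEL (the name given in the PR description) and that debug, info and warn are the accepted values. The helper resolve_log_level and its fall-back-to-warn behaviour are hypothetical, not the PR's actual implementation:

```shell
#!/bin/sh
# Hypothetical helper: pick the log level from the value passed in,
# falling back to a default when it is empty or not recognised.
resolve_log_level() {
  requested="$1"
  default="${2:-warn}"   # assumed default, matching the level this PR proposes
  case "$requested" in
    debug|info|warn) printf '%s\n' "$requested" ;;
    *)               printf '%s\n' "$default" ;;
  esac
}

# Possible use in a start script (the flag is the one shown in the diff above):
# /spinup-oasis-stack.sh --log.level "$(resolve_log_level "${OASIS_NODE_LOG_LEVEL:-}")"
```

With something like this in place, a developer investigating an issue could pass -e OASIS_NODE_LOG_LEVEL=debug to docker run and re-run the scenario.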
I think we'd like at least info by default? I'd actually prefer debug, as I'm one of those who get to investigate issues from time to time, and not having debug logs is kind of a pain, if not useless.
The motivation behind this change was to reduce disk and CPU usage of the container, so that developers don't need to buy a 1TB SSD just to store some logs that they're not going to look at 99.9% of the time :)
If there's an issue, the env var that I'm going to add soon can be changed to debug, and the developers can re-run their stuff and look at the logs.
Is this container used elsewhere where this kind of process isn't suitable?
👍
> that developers don't need to buy a 1TB SSD just to store some logs that they're not going to look at 99.9% of the time
Yeah, so if this is the main/only focus, then rotating the logs would work as well. Let's see what the numbers are with debug/info/warn logs. It would be interesting to see both the absolute disk usage of the logs and their share of the container's total disk usage.
My thinking is that if, with info, one is able to run the container for, let's say, two weeks without the logs consuming too much disk space (both in absolute terms and as a percentage), then I'm fine with setting the default to info and leaving it as it is. Otherwise we should likely also implement some kind of logrotate to limit the log sizes.
> Is this container used elsewhere where this kind of process isn't suitable?
Probably not.
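To illustrate the size-cap idea behind the logrotate suggestion, here is a rough sketch that empties a log file once it passes a limit. The helper name truncate_if_larger and the 100 MB limit are hypothetical choices for illustration. A real setup would more likely use logrotate itself:

```shell
#!/bin/sh
# Hypothetical helper: empty a log file once it exceeds a size limit.
# Usage: truncate_if_larger FILE MAX_BYTES
truncate_if_larger() {
  file="$1"
  max_bytes="$2"
  [ -f "$file" ] || return 0
  # $(( )) normalises any padding that wc may print around the number.
  size=$(( $(wc -c < "$file") ))
  if [ "$size" -gt "$max_bytes" ]; then
    : > "$file"   # truncate in place; the path and inode stay the same
  fi
}

# Example: cap the spinup log at roughly 100 MB (an arbitrary limit).
# truncate_if_larger /var/log/spinup-oasis-stack.log 100000000
```

One caveat: if the writing process did not open the file in append mode, it keeps writing at its old offset after truncation, which can leave a sparse file whose apparent size keeps growing. Redirecting with >> instead of > is the usual way around that.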
Sounds reasonable. I will make some measurements and we can decide then what to do :)
thanks!
Do we have some numbers for this? What's the approximate disk space usage for the container after 1 hour (or 10 minutes) in different settings, e.g. warn/info/debug? Does this also reduce CPU usage? I thought reducing CPU usage was the initial motivation for this, but I doubted it would be noticeable, although I could be wrong. If the only reason for doing this is reducing disk usage, we could also think about truncating logs (via logrotate or something).
Not yet, but I am planning to measure it.
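One way such numbers could be collected is to sample the log file's size before and after a fixed interval, once per log level. This is only a sketch; the helper names are hypothetical and the log path is the one from start.sh:

```shell
#!/bin/sh
# Hypothetical helper: print the size of a file in bytes.
file_size_bytes() {
  echo $(( $(wc -c < "$1") ))
}

# Hypothetical helper: print how many bytes a file grew during an interval.
# Usage: log_growth_bytes FILE SECONDS
log_growth_bytes() {
  before="$(file_size_bytes "$1")"
  sleep "$2"
  after="$(file_size_bytes "$1")"
  echo $(( after - before ))
}

# Example: growth of the spinup log over 10 minutes.
# log_growth_bytes /var/log/spinup-oasis-stack.log 600
```

Repeating the measurement with the level set to debug, info and warn would give the comparison asked for here.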
Force-pushed from b7b5e07 to a091242.
docker/sapphire-localnet/Dockerfile (outdated)
@@ -8,11 +8,15 @@ RUN cd oasis-web3-gateway && make && strip -S -x oasis-web3-gateway docker/commo
FROM ghcr.io/oasisprotocol/oasis-core-dev:stable-23.0.x AS oasis-core-dev

ENV OASIS_UNSAFE_SKIP_KM_POLICY=1
ENV OASIS_CORE_VERSION=master
Would it make sense to backport the oasis-core fix into the stable/23.0.x branch now (even if there's no release yet), and build oasis-core from there? master could technically get breaking changes merged in, which could cause things to fail.
Yes, that would be the best solution for this.
Done in oasisprotocol/oasis-core#5547 -- will change the branch now.
With the latest sapphire-localnet container with debug logging, it's using approx. 8 MB per minute of disk space for logging; it's slowly thrashing my SSDs. That's about 11 GB per day, 300 GB/month, etc. This makes sense, as I've run out of space on my Docker partition a couple of times while the container was left running for a couple of weeks. There's a lot of node gossip being logged, meaning it's very difficult to extract anything of relevance from the logs when something weird happens.
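The extrapolation from the per-minute figure is simple arithmetic, sketched here with hypothetical helper names and an assumed 30-day month. At 8 MB per minute it comes to 11,520 MB (about 11.5 GB) per day and roughly 345 GB per month, the same order of magnitude as the figures quoted:

```shell
#!/bin/sh
# Extrapolate a per-minute log rate (in MB) to a day and to a 30-day month.
mb_per_day()   { echo $(( $1 * 60 * 24 )); }
mb_per_month() { echo $(( $1 * 60 * 24 * 30 )); }

rate_mb_per_min=8   # figure quoted in the comment above
echo "per day:   $(mb_per_day "$rate_mb_per_min") MB"
echo "per month: $(mb_per_month "$rate_mb_per_min") MB"
```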
Force-pushed from a091242 to 6b7663c.
Yeah, let's default to info or warn, depending on what the difference in usage between info and warn is.
As promised, the graphs (this is just for an idling [...] I'd prefer to set [...]
Ok, agreed, thanks for the reports!
Force-pushed from 6b7663c to a2f603a.
This PR adds the following changes to the sapphire-localnet Docker image:
- Builds oasis-node and oasis-net-runner from the master branch (needed until oasisprotocol/oasis-core@522aeda ends up in a stable release).
- Reduces the size of the simple-keymanager binary from 5MB to 3MB (this is accomplished by tweaking Rust build options).
- Makes the log level configurable via an environment variable (OASIS_NODE_LOG_LEVEL).
- Changes the default log level from debug to warn, which should result in much lower disk consumption.
TODO:
- Measure disk usage at the different log levels (debug, info, warn).
- Switch to the stable/23.0.x branch.