Add script to analyze installer log for action run times #989

Draft · wants to merge 1 commit into base: develop

Conversation

@ehelms (Member) commented Oct 17, 2024

I've had this little script for a while and wanted to stop losing track of where it is. The idea is that it analyzes an installer run log file and identifies the actions that took the longest. For example, a recent analysis (a rough sketch of the kind of parsing involved follows the output):

61.42:  Exec[foreman-rake-db:migrate]
39.89:  Exec[pulpcore-manager migrate --noinput]
38.11:  Service[foreman]
26.33:  Service[dynflow-sidekiq@worker-hosts-queue-1]
26.27:  Service[dynflow-sidekiq@worker-1]
25.94:  Exec[foreman-rake-db:seed]
25.57:  Service[dynflow-sidekiq@orchestrator]
7.99:  Exec[pulpcore-manager reset-admin-password --random]
7.36:  Foreman_host[foreman-ip-10-0-168-157.rhos-01.prod.psi.rdu2.redhat.com]
4.13:  Foreman_smartproxy[ip-10-0-168-157.rhos-01.prod.psi.rdu2.redhat.com]
3.27:  Package[mod_ssl]
2.9:  Service[pulpcore-content.service]
2.82:  Service[pulpcore-api.service]
2.27:  Cert[ip-10-0-168-157.rhos-01.prod.psi.rdu2.redhat.com-apache]
2.25:  Exec[pulpcore-manager collectstatic --noinput]
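
As a rough illustration of the kind of parsing involved, here is a minimal sketch, not the script in this PR: it assumes the log contains Puppet evaltrace-style lines ("...Exec[foo]: Evaluated in 61.42 seconds"), so the regex and line format are assumptions that may need adjusting to the real log.

#!/usr/bin/env ruby
# Sketch: pull per-resource evaluation times out of an installer log and
# print the slowest resources first. The "Evaluated in N seconds" line
# format is an assumption about what the log contains.
profile = []

File.foreach(ARGV.fetch(0)) do |line|
  evaltrace = line.match(/(?<resource>[A-Z][\w:]*\[[^\]]+\]): Evaluated in (?<time>[\d.]+) seconds/)
  next unless evaltrace

  profile << { :task => evaltrace[:resource], :time => evaltrace[:time].to_f }
end

profile.sort_by! { |entry| -entry[:time] }

profile.first(20).each do |entry|
  puts "#{entry[:time].round(2)}:  #{entry[:task]}"
end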

@ekohl (Member) left a comment

I like the idea. That made me think: we already parse the report, so I wonder if we should just always gather this and output it at the debug level via a hook.
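
One possible shape for that, purely as a sketch: it does not use any specific kafo hook API, and the logger and profile variables are assumed to be provided by the surrounding installer code.

# Hypothetical post-run hook body: log the slowest resources at debug level.
# `profile` is assumed to already hold { :task => ..., :time => ... } entries
# gathered while parsing the report; `logger` is the installer's logger.
profile.sort_by { |entry| -entry[:time] }.first(15).each do |entry|
  logger.debug format('%.2f:  %s', entry[:time], entry[:task])
end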

profile << {:task => evaltrace[:resource], :time => evaltrace[:time].to_f}
end

profile.sort! { |a,b| b[:time] <=> a[:time] }

I prefer sort_by when I can

Suggested change
profile.sort! { |a,b| b[:time] <=> a[:time] }
profile.sort_by! { |p| -p[:time] }

@ekohl (Member) commented Oct 17, 2024

61.42: Exec[foreman-rake-db:migrate]

This runs foreman-rake twice, plus some other code.

38.11: Service[foreman]
26.33: Service[dynflow-sidekiq@worker-hosts-queue-1]
26.27: Service[dynflow-sidekiq@worker-1]
25.94: Exec[foreman-rake-db:seed]
25.57: Service[dynflow-sidekiq@orchestrator]

This all also loads the Rails environment one way or another, so that's 7 times (the five actions above plus the two foreman-rake runs in db:migrate) we do something that takes at least 25 seconds. I have no idea whether we can reduce that, but any time we can save there pays for itself very well.

That said, in some areas the Puppet model just limits us. If there were a way to determine whether something changed (like Ansible does) rather than needing to run some expensive script, it would speed things up. More importantly, the services could start in parallel.

I've been wondering if we can define a systemd target (foreman.target?) which pulls in every service, so we could just start that. Ideally we could also stop it to replace foreman-maintain's service command, but that's another long-term thought.
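
For illustration, a foreman.target along those lines could look roughly like this; the unit names and layout are hypothetical, not something that exists today.

# /etc/systemd/system/foreman.target (hypothetical sketch)
[Unit]
Description=Foreman and its related services
Wants=foreman.service dynflow-sidekiq@orchestrator.service dynflow-sidekiq@worker-1.service dynflow-sidekiq@worker-hosts-queue-1.service
After=network.target

[Install]
WantedBy=multi-user.target

Each service would additionally declare PartOf=foreman.target so that stopping or restarting the target propagates to all of them; systemctl start foreman.target then starts everything in parallel.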

@ekohl (Member) commented Oct 17, 2024

Note that https://github.com/Shopify/bootsnap says it can work in production:

All Bootsnap features are enabled in development, test, production, and all other environments according to the configuration in the setup. At Shopify, we use this gem safely in all environments without issue.

Also:

The core Shopify platform -- a rather large monolithic application -- boots about 75% faster, dropping from around 25s to 6.5s.

We currently only have:
https://github.com/theforeman/foreman/blob/2556ac5014b2d31cc94de498f15b3271b2c3fe0e/config/boot.rb#L8

So it's not used in production.
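
For reference, enabling bootsnap in every environment would be a small change to config/boot.rb. A hedged sketch follows; the exact contents of Foreman's boot.rb differ, and the cache directory shown here is an assumption, not current Foreman code.

# config/boot.rb (sketch, not the actual Foreman file)
ENV['BUNDLE_GEMFILE'] ||= File.expand_path('../Gemfile', __dir__)
require 'bundler/setup'

# Enable bootsnap unconditionally instead of only in development/test.
# BOOTSNAP_CACHE_DIR controls where the cache lives (bootsnap defaults to
# tmp/cache), which is relevant to the cache cleanup question below.
ENV['BOOTSNAP_CACHE_DIR'] ||= File.expand_path('../tmp/cache', __dir__)
require 'bootsnap/setup'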

@ehelms (Member, Author) commented Oct 18, 2024

Note that https://github.com/Shopify/bootsnap says it can work in production:

All Bootsnap features are enabled in development, test, production, and all other environments according to the configuration in the setup. At Shopify, we use this gem safely in all environments without issue.

I gave this a cursory test in production, on an already installed machine, re-running the installer but with services stopped.

Before bootsnap:

26.29:  Service[foreman]
26.14:  Service[dynflow-sidekiq@worker-1]
25.91:  Service[dynflow-sidekiq@worker-hosts-queue-1]
25.27:  Service[dynflow-sidekiq@orchestrator]
18.07:  Exec[foreman-rake-db:migrate]

After bootsnap:

17.46:  Service[foreman]
16.93:  Service[dynflow-sidekiq@worker-1]
16.83:  Service[dynflow-sidekiq@worker-hosts-queue-1]
16.75:  Service[dynflow-sidekiq@orchestrator]
10.09:  Exec[foreman-rake-db:migrate]

Bootsnap saves ~43 seconds across these five actions (roughly 121.7 seconds before vs 78.1 seconds after).

For a fresh installation scenario, the total was 220 seconds with bootsnap versus 276 seconds without, saving roughly the same amount of time:

45.65:  Exec[foreman-rake-db:migrate]
40.17:  Exec[pulpcore-manager migrate --noinput]
30.0:  Service[foreman]
18.4:  Exec[foreman-rake-db:seed]
17.74:  Service[dynflow-sidekiq@worker-1]
17.67:  Service[dynflow-sidekiq@worker-hosts-queue-1]
17.27:  Service[dynflow-sidekiq@orchestrator]
8.23:  Exec[pulpcore-manager reset-admin-password --random]
6.92:  Foreman_host[foreman-ip-10-0-168-57.rhos-01.prod.psi.rdu2.redhat.com]
3.16:  Cert[ip-10-0-168-57.rhos-01.prod.psi.rdu2.redhat.com-foreman-proxy]
2.99:  Package[mod_ssl]
2.98:  Foreman_instance_host[foreman-ip-10-0-168-57.rhos-01.prod.psi.rdu2.redhat.com]
2.98:  Foreman_host[foreman-proxy-ip-10-0-168-57.rhos-01.prod.psi.rdu2.redhat.com]
2.96:  Service[pulpcore-content.service]
2.88:  Exec[postgresql_initdb_instance_main]

@ekohl (Member) commented Oct 18, 2024

That's a nice win IMHO. They do mention the cache directory isn't cleaned, so let's discuss strategies for that. If it's stored in /run then it's probably fine, but I don't know about its size.
