advise on KubeCPUOvercommit #731

Open
eugenetaranov opened this issue Jan 10, 2022 · 1 comment
Comments

@eugenetaranov

Hello,

Could someone please explain why the KubeCPUOvercommit alert uses CPU requests rather than limits?
Here is the link to the related query:

sum(namespace_cpu:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s}) - (sum(kube_node_status_allocatable{resource="cpu"}) - max(kube_node_status_allocatable{resource="cpu"})) > 0

According to the Kubernetes documentation, a pod is scheduled only if all of its requests can be satisfied. It seems that real overcommit can occur only with limits, when some pods use more resources than they originally requested.

Thanks!

@eyenx

eyenx commented Jan 4, 2023

The description is quite handy here:

Cluster has overcommitted CPU resource requests for Pods by {{ $value }} CPU shares and cannot tolerate node failure.

In other words, it checks whether the CPU requested by all pods exceeds the total allocatable CPU minus that of the biggest node.

This alert predicts that some of your pods won't run if a node goes down. It has nothing to do with limits, as far as I understand it; it's just about being able to schedule all pods if one of the nodes is down (for example, during updates).
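
To make the arithmetic concrete, here is a minimal Python sketch of the condition the alert expression evaluates: total CPU requests minus (total allocatable CPU minus the largest node's allocatable CPU). It is not part of the mixin; the function name `cpu_overcommit`, the node names, and the sample numbers are made up for illustration.

```python
def cpu_overcommit(requests_by_namespace, allocatable_by_node):
    """Return CPU requests in excess of what fits once the largest node is gone.

    Mirrors: sum(requests) - (sum(allocatable) - max(allocatable)).
    A positive result corresponds to the alert condition (> 0).
    """
    total_requests = sum(requests_by_namespace.values())
    total_allocatable = sum(allocatable_by_node.values())
    largest_node = max(allocatable_by_node.values())
    return total_requests - (total_allocatable - largest_node)


# Hypothetical cluster: three nodes with 4 allocatable cores each.
nodes = {"node-a": 4, "node-b": 4, "node-c": 4}

# 7 cores requested: still fits on the remaining 8 cores if one node is lost.
print(cpu_overcommit({"default": 3, "monitoring": 4}, nodes))  # -1

# 10 cores requested: everything is scheduled today (10 <= 12), but losing
# one node leaves only 8 cores, so 2 cores of requests cannot be placed.
print(cpu_overcommit({"default": 6, "monitoring": 4}, nodes))  # 2
```

The second case shows why requests are the relevant quantity: every pod is scheduled while all nodes are up, yet the requests no longer fit once the largest node is removed.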
