advise on KubeCPUOvercommit #731

eugenetaranov · 2022-01-10T21:04:56Z

Hello,

Could someone please explain why the KubeCPUOvercommit alert uses CPU requests rather than limits?
Here is the link to the related query:

kubernetes-mixin/alerts/resource_alerts.libsonnet, line 28 in 9821d07:

    sum(namespace_cpu:kube_pod_container_resource_requests:sum{%(ignoringOverprovisionedWorkloadSelector)s}) - (sum(kube_node_status_allocatable{resource="cpu"}) - max(kube_node_status_allocatable{resource="cpu"})) > 0

According to the Kubernetes documentation, a pod is scheduled only if all of its requests can be satisfied, so it seems that real overcommit can occur only with limits, when some pods use more resources than they originally requested.

Thanks!

eyenx · 2023-01-04T15:44:26Z

The description is quite handy here:

Cluster has overcommitted CPU resource requests for Pods by {{ $value }} CPU shares and cannot tolerate node failure.

In other words, it checks whether the CPU requested by all pods exceeds the total allocatable CPU minus that of the largest node.

This alert predicts that some of your pods won't be schedulable if a node goes down. As far as I understand it, it has nothing to do with limits; it's just about being able to schedule all pods if one of the nodes is down (for example, during an update).
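
To make the arithmetic concrete, here is a minimal Python sketch of the same calculation the query performs. The function name `cpu_overcommit` and the node and request figures are made up for illustration; they are not taken from the mixin or from any real cluster.

```python
def cpu_overcommit(requested_cores, node_allocatable_cores):
    """Return requested CPU minus the capacity left after losing the largest node.

    Mirrors the shape of the alert expression:
    sum(requests) - (sum(allocatable) - max(allocatable))
    A result above zero corresponds to the alert condition (> 0).
    """
    total_allocatable = sum(node_allocatable_cores)
    largest_node = max(node_allocatable_cores)
    return sum(requested_cores) - (total_allocatable - largest_node)


# Hypothetical cluster: three nodes with 4, 4 and 8 allocatable cores.
nodes = [4.0, 4.0, 8.0]
# Hypothetical workloads requesting 9 cores in total.
requests = [3.0, 3.0, 3.0]

excess = cpu_overcommit(requests, nodes)
print(excess)      # 1.0
print(excess > 0)  # True
```

In this example every pod fits today (9 cores requested, 16 allocatable), but if the 8-core node is lost only 8 cores remain, so 1 core of requests could not be rescheduled. That is the situation the alert is meant to flag.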
