Prefer kube-scheduler's resource metrics to kube-state-metrics' #815
Conversation
As much as I like this change, I can see a problem with it in managed solutions (like EKS) where access to kube-scheduler is forbidden. In those cases, alerts and dashboards based on kube-scheduler data won't be useful at all. Given that, can we use OR statements instead of deprecating the kube-state-metrics data? I think something like kube_pod_resource_request OR kube_pod_container_resource_requests should do the trick.
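For illustration, a minimal sketch of the proposed fallback pattern (selectors omitted for brevity; as discussed further down in the thread, the two metrics carry different labelsets, so the final rules may need additional handling around the or):

```
# Intent of the proposal: prefer the kube-scheduler metric and fall back to
# the kube-state-metrics one when it is absent.
kube_pod_resource_request{resource="memory"}
or
kube_pod_container_resource_requests{resource="memory"}
```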
Note to self: Revisit this issue once prometheus/prometheus#9624 is implemented.
Just wondering if there's a better (or any) way to format embedded PromQL expressions in
Rebasing.
Since they are more accurate.
Refactor kube_pod_status_phase, since statuses other than "Pending" or "Running" are excluded or deprecated. Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
There seem to be quite a few breaking changes here, by removing recording rules and creating new ones (as a result of changing the rule names). That would affect any queries/dashboards which have been written outside of this repo (i.e. downstream projects).
I would suggest either of the below to maintain compatibility:
- keeping the existing rules and creating new rules with the desired changes, or
- keeping the existing rule names and updating the rule expressions with the changes
What do you think?
Added the recording rules back in, makes sense. PTAL. :)
Yeah, this feels better to me, thanks! Makes it easier to discuss the new metrics, too.
I'd vote for creating new recording rules based on kube-scheduler metrics. For instance:
- the existing namespace_memory:kube_pod_container_resource_requests:sum rule would continue to work on the kube-state-metrics kube_pod_container_resource_requests metric.
- there would be a new namespace_memory:kube_pod_resource_request:sum recording rule for the kube-scheduler kube_pod_resource_request metric.
- wherever we use namespace_memory:kube_pod_container_resource_requests:sum today, it would be changed to namespace_memory:kube_pod_resource_request:sum or namespace_memory:kube_pod_container_resource_requests:sum.
It would work whichever of the kube-state-metrics and kube-scheduler metrics are present, while staying efficient. Ideally there could be config options in the mixin to opt out of the kube-state-metrics or kube-scheduler recording rules.
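For illustration, a rough Jsonnet sketch of the proposed extra rule, assuming the mixin's existing config fields (kubeSchedulerSelector, clusterLabel) and mirroring the shape of the current namespace_memory:kube_pod_container_resource_requests:sum rule (simplified; the existing rule also joins against kube_pod_status_phase, which is left out here):

```
{
  // New recording rule fed by the kube-scheduler metric; the existing
  // kube-state-metrics-based rule would stay as-is.
  record: 'namespace_memory:kube_pod_resource_request:sum',
  expr: |||
    sum by (namespace, %(clusterLabel)s) (
      kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s}
    )
  ||| % $._config,
},
```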
kube_pod_container_resource_requests{resource="memory",%(kubeStateMetricsSelector)s}
) * on(namespace, pod, %(clusterLabel)s) group_left() max by (namespace, pod, %(clusterLabel)s) (
kube_pod_status_phase{phase=~"Pending|Running"} == 1
kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_requests{resource="memory",%(kubeStateMetricsSelector)s}
I'm not quite sure that it will do the right thing, since kube_pod_resource_request and kube_pod_container_resource_requests don't have the exact same labels, if I understand correctly.
I could be wrong here, but wouldn't the differing labelsets (the kube-scheduler metrics potentially having the additional scheduler and priority labels) after the or be sanitized down to the set of specified labels in the max operation, so doing an ignoring (scheduler, priority) would have no effect on the final result?
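For illustration, a rough sketch of the point being made (selectors dropped for brevity; scheduler and priority are assumed to exist only on the kube-scheduler metric):

```
# Whatever extra labels (scheduler, priority, container, ...) survive the or,
# the enclosing aggregation keeps only the labels it names, which is the
# "sanitizing" step referred to above.
max by (namespace, pod) (
  kube_pod_resource_request{resource="memory"}
  or
  kube_pod_container_resource_requests{resource="memory"}
)
```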
{
  record: 'cluster:namespace:pod_memory:active:kube_pod_resource_request_or_kube_pod_container_resource_requests',
  expr: |||
    (kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_requests{resource="memory",%(kubeStateMetricsSelector)s})
As kube_pod_container_resource_requests is container-level and kube_pod_resource_request is not, is it worth aggregating kube_pod_container_resource_requests up to the pod level so that these rules always have consistent labelsets?
So, for example, maybe wrapping the whole expression in a sum by (%(clusterLabel)s, %(namespaceLabel)s, pod, node) (...)?
I wonder if that would make the data a bit easier to work with, regardless of whether you have scheduler or KSM data.
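For reference, a rough sketch of the wrapping being suggested (label set taken from the comment above; the join against kube_pod_status_phase in the existing rule is omitted):

```
sum by (%(clusterLabel)s, %(namespaceLabel)s, pod, node) (
  kube_pod_resource_request{resource="memory",%(kubeSchedulerSelector)s}
  or
  kube_pod_container_resource_requests{resource="memory",%(kubeStateMetricsSelector)s}
)
```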
I don't think that this rule should be changed: as you said, kube-scheduler metrics have a pod-level granularity while this recorded metric is per-container.
sum by (namespace, %(clusterLabel)s) (
  sum by (namespace, pod, %(clusterLabel)s) (
    max by (namespace, pod, container, %(clusterLabel)s) (
      kube_pod_container_resource_limits{resource="memory",%(kubeStateMetricsSelector)s}
Think you maybe intended to or the kube_pod_resource_limit metric here?
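Presumably something along these lines (a sketch only, reusing the selectors from the surrounding rules):

```
kube_pod_resource_limit{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="memory",%(kubeStateMetricsSelector)s}
```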
{
  record: 'cluster:namespace:pod_cpu:active:kube_pod_resource_limit_or_kube_pod_container_resource_limits',
  expr: |||
    (kube_pod_resource_limit{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="cpu",%(kubeStateMetricsSelector)s})
Suggested change:
- (kube_pod_resource_limit{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="cpu",%(kubeStateMetricsSelector)s})
+ (kube_pod_resource_limit{resource="cpu",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="cpu",%(kubeStateMetricsSelector)s})
sum by (namespace, %(clusterLabel)s) (
  sum by (namespace, pod, %(clusterLabel)s) (
    max by (namespace, pod, container, %(clusterLabel)s) (
      kube_pod_resource_limit{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="cpu",%(kubeStateMetricsSelector)s}
Suggested change:
- kube_pod_resource_limit{resource="memory",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="cpu",%(kubeStateMetricsSelector)s}
+ kube_pod_resource_limit{resource="cpu",%(kubeSchedulerSelector)s} or kube_pod_container_resource_limits{resource="cpu",%(kubeStateMetricsSelector)s}
@@ -54,7 +54,7 @@ Jsonnet offers the ability to parameterise configuration, allowing for basic cus
  alert: "KubePodNotReady",
  expr: |||
    sum by (namespace, pod) (
-     kube_pod_status_phase{%(kubeStateMetricsSelector)s, phase!~"Running|Succeeded"}
+     kube_pod_status_phase{%(kubeStateMetricsSelector)s, phase!~"Running"}
not sure about this change, is it needed?
@@ -33,7 +33,7 @@
  expr: |||
    sum by (namespace, pod, %(clusterLabel)s) (
      max by(namespace, pod, %(clusterLabel)s) (
-       kube_pod_status_phase{%(prefixedNamespaceSelector)s%(kubeStateMetricsSelector)s, phase=~"Pending|Unknown|Failed"}
+       kube_pod_status_phase{%(prefixedNamespaceSelector)s%(kubeStateMetricsSelector)s, phase="Pending"}
same here
Use kube-scheduler's metrics instead of kube-state-metrics', as they are more precise. Refer to the links below for more details.
Also, refactor kube_pod_status_phase, since statuses other than "Pending" or "Running" are excluded or deprecated.