We are seeing the following error from Alertmanager:
Prometheus monitoring/prometheus-prometheus-operator-kube-p-prometheus-1 has failed to evaluate 4 rules in the last 5m.
It appears to happen every few hours, for a few minutes at a time.
Looking at the Prometheus logs, we see:
ts=2022-06-26T13:48:58.529Z caller=manager.go:609 level=warn component="rule manager" group=k8s.rules msg="Evaluating rule failed" rule="record: node_namespace_pod_container:container_memory_rss\nexpr: container_memory_rss{image!=\"\",job=\"kubelet\",metrics_path=\"/metrics/cadvisor\"}\n * on(namespace, pod) group_left(node) topk by(namespace, pod) (1, max by(namespace,\n pod, node) (kube_pod_info{node!=\"\"}))\n" err="multiple matches for labels: grouping labels must ensure unique matches"
I am not sure whether something specific in our cluster triggers these errors, but the node_namespace_pod_container:container_memory_rss recording rule appears to be the one failing. This is its expression:
container_memory_rss{image!="",job="kubelet",metrics_path="/metrics/cadvisor"} * on(namespace, pod) group_left(node) topk by(namespace, pod) (1, max by(namespace, pod, node) (kube_pod_info{node!=""}))
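For debugging: the error means the right-hand ("one") side of the group_left match returned more than one series for the same (namespace, pod) pair. As a rough check, one way to see whether kube_pod_info is reporting the same pod on more than one node (a common suspect for this kind of intermittent failure) is a query along these lines, evaluated at a timestamp close to one of the failed evaluations; the label names assume the standard kube-state-metrics metric:

count by (namespace, pod) (max by (namespace, pod, node) (kube_pod_info{node!=""})) > 1

If this returns anything at the time of a failure, two different node values exist for the same pod at that instant (for example while a pod is being rescheduled, or while two kube-state-metrics replicas overlap during a rollout), which would fit the error only showing up for a few minutes every few hours.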