Lack CPU/Memory total allocatable vs requested #749

Open
tcharette opened this issue Mar 8, 2022 · 1 comment

tcharette commented Mar 8, 2022

Hey gang,

I think there's a gap in the Compute Resources / Cluster dashboard: there is no view that shows the total allocatable memory/CPU against the limits currently set across all namespaces, which would tell you whether your K8s cluster is close to being over-allocated.
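To make the cluster-wide comparison concrete, here is a minimal sketch in Python. It assumes you already have one allocatable value per node and one limit value per container (for example, exported from Prometheus); the figures, the `cluster_allocation_ratio` helper, and the `WARN_AT` threshold are all hypothetical and only illustrate the arithmetic.

```python
from typing import Iterable


def cluster_allocation_ratio(allocatable: Iterable[float], limits: Iterable[float]) -> float:
    """Return the summed limits as a fraction of the summed allocatable capacity."""
    total_allocatable = sum(allocatable)
    if total_allocatable <= 0:
        raise ValueError("total allocatable must be positive")
    return sum(limits) / total_allocatable


# Hypothetical memory figures in GiB: one entry per node, one entry per container limit.
node_allocatable_gib = [16.0, 16.0, 32.0]
container_limits_gib = [8.0, 12.0, 4.0, 24.0]

ratio = cluster_allocation_ratio(node_allocatable_gib, container_limits_gib)

WARN_AT = 0.8  # hypothetical warning threshold, not a Kubernetes or Grafana default
print(f"limits cover {ratio:.0%} of allocatable memory")
if ratio >= WARN_AT:
    print("cluster is close to (or past) full allocation")
```

A ratio above 1.0 would mean the limits add up to more than the cluster can hand out.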

I saw a similar PR merged here: #708, but that isn't per node per cluster.

Essentially, what I'm asking is: should I make a PR to include this new board, or does the information exist and I'm too dumb to see it?

The Compute Resources / Cluster dashboard has some percentages based on similar calculations, but it lacks any way to see how much of each node is allocated, or whether the spare room is on a stateful node or not.
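The per-node breakdown described above could look roughly like the following Python sketch. It assumes per-node allocatable values and per-node summed limits are already available as dictionaries; the node names, the numbers, and the `stateful_nodes` set are made up for illustration, and how a node is marked as stateful is left open.

```python
from typing import Dict, NamedTuple


class NodeAllocation(NamedTuple):
    allocatable: float  # capacity the node can hand out
    limits: float       # sum of container limits scheduled on the node
    headroom: float     # allocatable - limits; negative means over-allocated
    percent: float      # limits as a percentage of allocatable


def allocation_by_node(allocatable: Dict[str, float], limits: Dict[str, float]) -> Dict[str, NodeAllocation]:
    """Compare summed limits with allocatable capacity for every node."""
    result = {}
    for node, capacity in allocatable.items():
        used = limits.get(node, 0.0)  # nodes without any limits count as 0
        percent = 100.0 * used / capacity if capacity else 0.0
        result[node] = NodeAllocation(capacity, used, capacity - used, percent)
    return result


# Hypothetical CPU figures in cores; not taken from a real cluster.
allocatable = {"node-a": 4.0, "node-b": 8.0}
limits = {"node-a": 3.0, "node-b": 10.0}
stateful_nodes = {"node-b"}  # hypothetical: nodes you consider stateful

report = allocation_by_node(allocatable, limits)
for node, row in sorted(report.items()):
    kind = "stateful" if node in stateful_nodes else "stateless"
    status = "over-allocated" if row.headroom < 0 else "ok"
    print(f"{node} ({kind}): {row.percent:.0f}% allocated, headroom {row.headroom:+.1f} cores ({status})")

# Cluster-wide headroom is the sum of the per-node values.
cluster_headroom = sum(row.headroom for row in report.values())
```

Keeping the per-node rows separate, rather than only summing them, is what shows where the spare room actually is.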

If this already exists my apologies!

If you don't want to see the dashboard, here's a picture instead.

image

I've provided an example dashboard of what I mean:
{ "annotations": { "list": [ { "builtIn": 1, "datasource": "-- Grafana --", "enable": true, "hide": true, "iconColor": "rgba(0, 211, 255, 1)", "name": "Annotations & Alerts", "target": { "limit": 100, "matchAny": false, "tags": [], "type": "dashboard" }, "type": "dashboard" } ] }, "editable": true, "fiscalYearStartMonth": 0, "graphTooltip": 0, "id": 225, "iteration": 1645813499547, "links": [], "liveNow": false, "panels": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisLabel": "", "axisPlacement": "auto", "axisSoftMin": 0, "barAlignment": 0, "drawStyle": "line", "fillOpacity": 100, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineWidth": 1, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "never", "spanNulls": false, "stacking": { "group": "A", "mode": "normal" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null } ] }, "unit": "decbytes" }, "overrides": [ { "matcher": { "id": "byFrameRefID", "options": "B" }, "properties": [ { "id": "custom.fillOpacity", "value": 0 } ] } ] }, "gridPos": { "h": 8, "w": 12, "x": 0, "y": 0 }, "id": 4, "options": { "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, "tooltip": { "mode": "single" } }, "repeat": "instance", "repeatDirection": "h", "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "exemplar": true, "expr": "sum by (node) (kube_pod_container_resource_limits{resource=\"memory\"})", "format": "time_series", "hide": false, "interval": "", "legendFormat": "{{node}} limits", "refId": "A" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "exemplar": true, "expr": "sum(sum by(node) (kube_node_status_allocatable{resource=\"memory\"}) - sum by (node) (kube_pod_container_resource_limits{resource=\"memory\"}))", "hide": false, "interval": "", "intervalFactor": 1, "legendFormat": "Total Allocatable", "refId": "B" } ], "title": "Memory", "transformations": [], "type": "timeseries" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "fieldConfig": { "defaults": { "color": { "mode": "palette-classic" }, "custom": { "axisLabel": "", "axisPlacement": "auto", "axisSoftMin": 0, "barAlignment": 0, "drawStyle": "line", "fillOpacity": 80, "gradientMode": "none", "hideFrom": { "legend": false, "tooltip": false, "viz": false }, "lineInterpolation": "linear", "lineWidth": 1, "pointSize": 5, "scaleDistribution": { "type": "linear" }, "showPoints": "never", "spanNulls": false, "stacking": { "group": "A", "mode": "normal" }, "thresholdsStyle": { "mode": "off" } }, "mappings": [], "thresholds": { "mode": "absolute", "steps": [ { "color": "green", "value": null }, { "color": "red", "value": 80 } ] }, "unit": "vCPU" }, "overrides": [ { "matcher": { "id": "byFrameRefID", "options": "B" }, "properties": [ { "id": "custom.fillOpacity", "value": 0 } ] } ] }, "gridPos": { "h": 9, "w": 12, "x": 0, "y": 8 }, "id": 2, "options": { "legend": { "calcs": [], "displayMode": "list", "placement": "bottom" }, "tooltip": { "mode": "single" } }, "targets": [ { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "exemplar": true, "expr": "sum by (node) (kube_pod_container_resource_limits{resource=\"cpu\"})", "hide": false, "interval": "", "legendFormat": "{{node }} limits", "refId": "A" }, { "datasource": { "type": "prometheus", "uid": "${datasource}" }, "exemplar": false, "expr": "sum(sum by(node) (kube_node_status_allocatable{resource=\"cpu\"}) - sum by (node) (kube_pod_container_resource_limits{resource=\"cpu\"}))", "hide": false, "interval": "", "legendFormat": "Total allocatable", "refId": "B" } ], "title": "CPU", "type": "timeseries" } ], "refresh": false, "schemaVersion": 34, "style": "dark", "tags": [], 
"templating": { "list": [ { "current": { "selected": true, "text": "", "value": "" }, "hide": 0, "includeAll": false, "multi": false, "name": "datasource", "options": [], "query": "prometheus", "queryValue": "", "refresh": 1, "regex": "", "skipUrlSync": false, "type": "datasource" }, { "current": { "isNone": true, "selected": false, "text": "None", "value": "" }, "datasource": { "type": "prometheus", "uid": "${datasource}" }, "definition": "label_values(apiserver_request_total, cluster)", "hide": 2, "includeAll": false, "label": "cluster", "multi": false, "name": "cluster", "options": [], "query": { "query": "label_values(apiserver_request_total, cluster)", "refId": "StandardVariableQuery" }, "refresh": 2, "regex": "", "skipUrlSync": false, "sort": 1, "type": "query" }, { "current": { "selected": false, "text": "All", "value": "$__all" }, "datasource": { "type": "prometheus", "uid": "${datasource}" }, "definition": "label_values(apiserver_request_total{job=\"apiserver\", cluster=\"${cluster}\"}, instance)", "hide": 0, "includeAll": true, "label": "node", "multi": false, "name": "node", "options": [], "query": { "query": "label_values(apiserver_request_total{job=\"apiserver\", cluster=\"${cluster}\"}, instance)", "refId": "StandardVariableQuery" }, "refresh": 2, "regex": "", "skipUrlSync": false, "sort": 1, "type": "query" } ] }, "time": { "from": "2022-02-18T19:46:01.026Z", "to": "2022-02-18T19:46:31.682Z" }, "timepicker": {}, "timezone": "", "title": "Total Allocation by Environment", "uid": "vgjoNsJ7k", "version": 1, "weekStart": "" }

github-actions bot commented Oct 1, 2024

This issue has not had any activity in the past 30 days, so the
stale label has been added to it.

  • The stale label will be removed if there is new activity
  • The issue will be closed in 7 days if there is no new activity
  • Add the keepalive label to exempt this issue from the stale check action

Thank you for your contributions!

@github-actions github-actions bot added the stale label Oct 1, 2024