[bug] Some dashboards are broken on high pods count #129

Open
maxpain opened this issue Nov 9, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@maxpain
maxpain commented Nov 9, 2024

Describe the bug

Hello. We have a lot of short-lived pods in our clusters. It's also a problem for frequent CronJobs.

Screen.Recording.2024-11-09.at.10.35.49.mov

How to reproduce?

No response

Expected behavior

No response

Additional context

No response

@EladAviczer
Have you tried increasing the CPU resources allocated to Prometheus?

@maxpain
Author

maxpain commented Nov 21, 2024

@EladAviczer did you watch the video?

@EladAviczer
*victoriaMetrics

@maxpain
Author

maxpain commented Nov 21, 2024

> *victoriaMetrics

The problem is not in Prometheus/VictoriaMetrics, but in the Grafana dashboard itself.

@EladAviczer
EladAviczer commented Nov 21, 2024

The dashboard uses VictoriaMetrics to query the data, and you get a 422 Unprocessable Content error when calling the PromQL/MetricsQL query.

I'm not saying I'm 100% sure that it is a VictoriaMetrics problem, but it could be, and you should check it too. :)

@maxpain
Author

maxpain commented Nov 21, 2024

> The dashboard uses VictoriaMetrics to query the data, and you get a 422 Unprocessable Content error when calling the PromQL/MetricsQL query.

The problem is that there are a lot of pods (because a CronJob runs every minute), and this dashboard tries to pass the whole array of pod names (1440 pods for the last 24 hours), which will fail on any installation (Prometheus or VictoriaMetrics).
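
To make the numbers concrete, here is a rough sketch of how that pod list grows. It assumes Grafana expands a multi-value variable into a `(a|b|c)` regex alternation for Prometheus-style data sources. The CronJob name, the pod name pattern, and the `pod` label matcher are hypothetical, made up for illustration, and not taken from the dashboard.

```python
# Rough sketch: how large does a pod=~"(...)" matcher get for a per-minute CronJob?
# The pod names and the matcher are hypothetical, not taken from the dashboard.

MINUTES_PER_DAY = 24 * 60  # one pod per minute over a 24-hour window


def fake_pod_names(cronjob: str, count: int) -> list[str]:
    """Generate placeholder pod names, one per CronJob run."""
    return [f"{cronjob}-{28850000 + i}-abcde" for i in range(count)]


def pod_matcher(pods: list[str]) -> str:
    """Join pod names into one regex alternation, as a multi-value variable would."""
    return 'pod=~"(' + "|".join(pods) + ')"'


pods = fake_pod_names("my-cronjob", MINUTES_PER_DAY)
matcher = pod_matcher(pods)

print(len(pods))     # number of pods in the last 24 hours
print(len(matcher))  # characters in this one label matcher
```

With 25-character placeholder names, the matcher alone runs to tens of thousands of characters, and it is repeated in every panel query that uses the variable.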

@dotdc
Owner

dotdc commented Nov 22, 2024

Hi @maxpain,

The created_by variable was introduced to enable filtering on deployments, but if there are too many pods, you'll end up with a 422 Unprocessable Entity error, as you just experienced.

I’ll check if there’s a better solution, but removing the created_by variable might work better in your case.
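
As a rough illustration of why dropping the variable may help: without a per-pod matcher, the selector stays the same size no matter how many pods have run. This is only a sketch and not the dashboard's actual query. The `build_selector` helper, the namespace, and the pod names are hypothetical, and the metric name is used purely as an example.

```python
# Sketch: build a selector with an optional pod filter.
# The helper, namespace, and pod names are hypothetical placeholders.
from typing import Optional, Sequence


def build_selector(metric: str, namespace: str,
                   pods: Optional[Sequence[str]] = None) -> str:
    """Return a PromQL-style selector; add a pod regex only when pods are given."""
    matchers = ['namespace="' + namespace + '"']
    if pods:
        matchers.append('pod=~"(' + "|".join(pods) + ')"')
    return metric + "{" + ", ".join(matchers) + "}"


# One pod per minute for 24 hours versus no pod filter at all.
with_pods = build_selector(
    "container_cpu_usage_seconds_total", "default",
    [f"my-cronjob-{i}" for i in range(1440)],
)
without_pods = build_selector("container_cpu_usage_seconds_total", "default")

print(len(with_pods), len(without_pods))
```

The trade-off is that the panels then show every pod in the namespace instead of only those owned by the selected workload.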

3 participants