You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Describe the bug
Loki ruler alerts not firing using latest loki helm chart running in Distributed mode.
For the record, loki-distributed helm chart is able to process these rules and alert on them with no issues.
Querier frontend is reporting status=200 and querier reports status=500 and total_bytes=0B for both attempts for the alert rule. Both of these requests have source=ruler:
Additionally I thought it was a querier issue so i've messed around with frontend_address and scheduler_address with the frontend/worker settings which have also produced the same results.
Ive been trying to debug this issue now for over a week and nothing seems to come to fruition. Looking for some ideas as to what could be going on here. Thank you for your time!
The text was updated successfully, but these errors were encountered:
For us the problem was related to using the multi-tenant setup. In such case the ruler has to be proxied (and augmented with extra headers) to the the query frontend to allow it to obtain the logs data from all the tenants. Otherwise the rule expressions keep returning empty lines in the ruler, although the query is shown to succeed.
EDIT:
What's important to add is that this setup worked for us before Loki 3.1. Something must have changed that now requires the ruler to evaluate queries via an additional proxy.
Describe the bug
Loki ruler alerts not firing using latest loki helm chart running in
Distributed
mode.For the record,
loki-distributed
helm chart is able to process these rules and alert on them with no issues.To Reproduce
Steps to reproduce the behavior:
(sum(count_over_time({job=~".+", env="env"} |~ ".+level=warn.+"[30s]) >= 1))
Expected behavior
Alerts sent for these queries
Environment:
Screenshots, Promtail config, or terminal output
Here is an output of all the pods we have running from the
loki
helm chart.Here is the current deployment from
helm ls
loki <namespace> 72 2024-11-06 10:58:15.215559 -0700 MST deployed loki-6.18.0 3.2.0
Ruler configuration:
Rule in question which has been setup as
/etc/loki/rules/fake/fake
changing undotted file to.txt
or.yaml
changes nothing:logcli through the gateway returns data for this particular query_range:
Querier frontend is reporting
status=200
and querier reportsstatus=500
andtotal_bytes=0B
for both attempts for the alert rule. Both of these requests havesource=ruler
:From logcli query through the
gateway
address logstotal_bytes=340MB
:Additionally I thought it was a querier issue so i've messed around with
frontend_address
andscheduler_address
with the frontend/worker settings which have also produced the same results.Ive been trying to debug this issue now for over a week and nothing seems to come to fruition. Looking for some ideas as to what could be going on here. Thank you for your time!
The text was updated successfully, but these errors were encountered: