Setting custom timestamps in deltatocumulative processor #36457
Unanswered · owenzhangdd asked this question in Q&A · Replies: 0 comments
Hi, I have a bit of a unique use case for the `deltatocumulative` processor. Before data arrives in this processor, we run a custom aggregation: a giant group-by that aggregates away a set of predefined attributes. The inputs to this aggregation are delta sums and histograms, so we of course need to set a new timestamp on each aggregated data point. What we've been using is: start timestamp = the time we begin computing the aggregation, and timestamp = the time we finish computing it. This naturally leaves a lot of gaps (the time between calculations), which doesn't quite conform to what the metrics data model expects of delta metrics: all the [startTimestamp, timestamp] intervals for the data points in a time series should cover a full, unbroken span of time without overlaps.

We could fix this by making our custom aggregation stateful: record the last aggregation's timestamp for each time series and use it as the startTimestamp of the next calculation (sketched below). But first I want to understand whether our current configuration is likely to cause the unexpected behavior we see: with large values of `max_stale` (anything 1h or greater), over time we get way higher values for our time series than expected, and from our specific context we know these values are too high. The difference is especially pronounced for sparse time series: metrics that track errors/failures are farther off than success metrics, and failure metrics tend to be sparser.
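To make the stateful fix concrete, here is a minimal sketch of the bookkeeping we have in mind, in plain Go against the standard library only. The `seriesKey`, `intervalTracker`, and `next` names are hypothetical, not collector APIs: the idea is simply to remember each series' last end timestamp and reuse it as the next interval's start, so the delta intervals stay contiguous.

```go
package main

import (
	"fmt"
	"time"
)

// seriesKey is a hypothetical identity for a time series after our
// group-by (e.g. metric name plus remaining attributes, serialized).
type seriesKey string

// intervalTracker remembers the end timestamp of the last emitted
// delta interval per series, so the next interval can start exactly
// where the previous one ended (no gaps, no overlaps).
type intervalTracker struct {
	lastEnd map[seriesKey]time.Time
}

func newIntervalTracker() *intervalTracker {
	return &intervalTracker{lastEnd: make(map[seriesKey]time.Time)}
}

// next returns the [start, end] interval to stamp onto the data point
// produced by the current aggregation run, which finished at `end`.
// On the first run for a series there is no previous interval, so we
// fall back to the time the run began.
func (t *intervalTracker) next(key seriesKey, runStart, end time.Time) (time.Time, time.Time) {
	start, ok := t.lastEnd[key]
	if !ok {
		start = runStart // first interval for this series
	}
	t.lastEnd[key] = end
	return start, end
}

func main() {
	tracker := newIntervalTracker()

	// First aggregation run: start == the time we began computing.
	runStart := time.Now()
	end := runStart.Add(2 * time.Second)
	s, e := tracker.next("http.errors{region=us}", runStart, end)
	fmt.Println(s, e)

	// Second run, 60s later: its start is the previous run's end, so
	// the intervals are contiguous even though we only compute
	// periodically.
	runStart2 := end.Add(60 * time.Second)
	end2 := runStart2.Add(2 * time.Second)
	s2, e2 := tracker.next("http.errors{region=us}", runStart2, end2)
	fmt.Println(s2, e2) // s2 == e, the previous interval's end
}
```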
The reverse is true too: with `max_stale=5m`, failures/sparse metrics are under-reported. What we'd expect is that low values of `max_stale` under-report rates/increases for sparse metrics but converge on the correct value as we increase `max_stale`. Instead, higher `max_stale` overshoots and starts to over-report severely, even with values as low as 1h. For reference, the relevant part of our setup looks roughly like the config sketch below.
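A minimal collector config using the processor with the `max_stale` values discussed above; the pipeline components here are illustrative, not our exact pipeline:

```yaml
processors:
  deltatocumulative:
    # How long to keep state for a series that receives no new
    # samples before dropping it; we've varied this from 5m to 1h+.
    max_stale: 1h

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [deltatocumulative]
      exporters: [otlp]
```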
So before we invest time in making our custom aggregation stateful, we first want to understand whether our current setup is likely the cause of the data discrepancy we see. Additional details on how the `deltatocumulative` processor works would help as well. Thank you!