-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: Update Created Timestamps syntax for OpenMetrics #43
base: main
Are you sure you want to change the base?
Changes from 10 commits
1d313cc
9531e66
768d1f8
be917f1
6f7889d
e24e60d
73a57fa
696b466
bbf069f
5f78950
f77c828
9b87407
a34f609
aea32d8
3a4bc9c
8f730b7
509eb0a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,151 @@ | ||||||
# Update Created Timestamps syntax for OpenMetrics | ||||||
|
||||||
* **Owners:** | ||||||
* Manik Rana [@Maniktherana](https://github.com/Maniktherana) | ||||||
* Arthur Silva Sens [@ArthurSens](https://github.com/ArthurSens) | ||||||
* Bartłomiej Płotka [@bwplotka](https://github.com/bwplotka) | ||||||
|
||||||
* **Implementation Status:** Not implemented | ||||||
|
||||||
* **Related Issues and PRs:** | ||||||
* [github.com/prometheus/prometheus/issues/14217](https://github.com/prometheus/prometheus/issues/14217) | ||||||
* [github.com/prometheus/prometheus/issues/14823](https://github.com/prometheus/prometheus/issues/14823) | ||||||
|
||||||
* **Other docs or links:** | ||||||
* [github.com/prometheus/proposals/blob/main/proposals/2023-06-13_created-timestamp.md](https://github.com/prometheus/proposals/blob/main/proposals/2023-06-13_created-timestamp.md) | ||||||
|
||||||
> TL;DR: We propose an updated syntax for handling created timestamps in the OpenMetrics exposition format. This syntax allows for more efficient parsing of created lines and eliminates confusion on naming metrics that support `_created` lines by placing the created timestamp inline with the metric it is attached to. | ||||||
|
||||||
## Why | ||||||
|
||||||
Created Timestamps (CTs) were proposed in the summer of 2023 with support for the OpenMetrics (OM) exposition format following in summer of 2024. Once CT lines could be parsed by Prometheus in OM text it felt as though the syntax could be improved upon to optimize how the parser picks up these CTs. CT lines can be characterized as followed: | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
|
||||||
* They are represented as a standalone line in the OM text exposition format | ||||||
* It is denoted with a metric name and `_created` suffix | ||||||
* It can appear immediately after its associated metric line or between other metric lines and can be placed several lines apart while sharing labels where applicable. | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
|
||||||
These characterisitics, specifically the final one means that the parser must search or "peek" ahead to find the `_created` line for a given metric with the same label set and thus, requires additional CPU/memory resources when it can be saved. | ||||||
|
||||||
This search operation can be specifically taxing when the CT line, if it exists, is the very last line in a large MetricFamily such as that of a histogram with many buckets. | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
|
||||||
### Pitfalls of the current solution | ||||||
|
||||||
As stated above, `_created` lines can appear anywhere after its associated metric line. This means parser is required to store the current position of the lexer before we start searching for the created timestamp. Currently, we cache the timestamp and minimize data stored in memory. Before, we used to make a deep copy of the parser at every line whenever the `CreatedTimestamp` function was called which lead to [Prometheus consuming Gigabytes more memory](https://github.com/prometheus/prometheus/issues/14808) than needed. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There's no need to mention how the parser used to look like. That implementation was not OM's fault; it was just us iterating over the implementation until we found a more efficient solution 🤔 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about
Suggested change
|
||||||
|
||||||
## Goals | ||||||
|
||||||
* Prometheus can efficiently parse Created Timestamps without peeking forward. | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
* OpenMetrics has an updated specification with a clear and precise syntax. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about "We don't produce artificial metrics for CT semantics." |
||||||
|
||||||
### Audience | ||||||
|
||||||
* Developers maintaining the OpenMetrics text parser. | ||||||
* Developers maintaining client libraries. | ||||||
* Users that utilize the OpenMetrics Text format with Created Timestamps enabled. | ||||||
|
||||||
## Non-Goals | ||||||
|
||||||
* Require backwards compatibility. While this can be a step towards an OM 2.0 this proposal only deals with created timestamps which is a small subset of the specification. | ||||||
* Dealing with the storage of CTs. | ||||||
* Add CT support to additional metric types like guages. | ||||||
|
||||||
## How | ||||||
|
||||||
We store the Created Timestamp inline with the metric it is attached to and we remove `_created` lines. This will allow the parser to immediately associate the CT with the metric without having to search for it. For counters its straightforward as we can just add the timestamp after the metric value like an exemplar. To separate it from a traditional timstamp we can prefix the created timestamp with something like `ct@`, although this portion of the syntax is not final and can be changed. Furthermore, we can order the created timestamp such that it is after the metric value + timestamp and before an exemplar. | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
|
||||||
Lets look at some examples to illustrate the difference. | ||||||
|
||||||
### Counters | ||||||
|
||||||
``` | ||||||
# HELP foo Counter with and without labels to certify CT is parsed for both cases | ||||||
# TYPE foo counter | ||||||
foo_total 17.0 1520879607.789 ct@1520872607.123 # {id="counter-test"} 5 | ||||||
foo_total{a="b"} 17.0 1520879607.789 ct@1520872607.123 # {id="counter-test"} 5 | ||||||
foo_total{le="c"} 21.0 ct1520872621.123 | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
foo_total{le="1"} 10.0 | ||||||
``` | ||||||
|
||||||
vs the current syntax: | ||||||
|
||||||
``` | ||||||
# HELP foo Counter with and without labels to certify CT is parsed for both cases | ||||||
# TYPE foo counter | ||||||
foo_total 17.0 1520879607.789 # {id="counter-test"} 5 | ||||||
foo_created 1520872607.123 | ||||||
foo_total{a="b"} 17.0 1520879607.789 # {id="counter-test"} 5 | ||||||
foo_created{a="b"} 1520872607.123 | ||||||
foo_total{le="c"} 21.0 | ||||||
foo_created{le="c"} 1520872621.123 | ||||||
foo_total{le="1"} 10.0 | ||||||
``` | ||||||
|
||||||
### Summaries and Histograms | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you have any preference between the two proposals? For proposal 1: Do we know if that's a complicated thing for SDKs? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I actually haven't thought of this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's mention what you propose, and move other to alternatives. I think I would put it on every line - simpler and we are chatting about the idea of single line for each complex type too: prometheus/OpenMetrics#283 |
||||||
|
||||||
Summaries and histograms are a bit more complex as they have quantiles and buckets respectively. Moreoever, there is no defacto line like in a counter where we can place the CT. Thus, we can opt to place the CT on the first line of the metric with the same label set. We can then cache this timestamp with a hash of the label set and use it for all subsequent lines with the same label set. This is something we already do with the current syntax. | ||||||
|
||||||
A diff example (for brevity) of a summary metric with current vs proposed syntax: | ||||||
|
||||||
```diff | ||||||
# HELP rpc_durations_seconds RPC latency distributions. | ||||||
# TYPE rpc_durations_seconds summary | ||||||
+rpc_durations_seconds{service="exponential",quantile="0.5"} 7.689368882420941e-07 ct@1.7268398130168908e+09 | ||||||
-rpc_durations_seconds{service="exponential",quantile="0.5"} 7.689368882420941e-07 | ||||||
rpc_durations_seconds{service="exponential",quantile="0.9"} 1.6537614174305048e-06 | ||||||
rpc_durations_seconds{service="exponential",quantile="0.99"} 2.0965499063061924e-06 | ||||||
rpc_durations_seconds_sum{service="exponential"} 2.0318666372575776e-05 | ||||||
rpc_durations_seconds_count{service="exponential"} 22 | ||||||
-rpc_durations_seconds_created{service="exponential"} 1.7268398130168908e+09 | ||||||
``` | ||||||
|
||||||
With the current syntax `_created` line can be anywhere after the `quantile` lines. A histogram would look similar to the summary but with `le` labels instead of `quantile` labels. | ||||||
|
||||||
Another option is to simply place the CT on every line of a summary or histogram metric. This would be more verbose but would be more explicit and easier to parse and avoids storing the CT completely: | ||||||
|
||||||
``` | ||||||
# HELP rpc_durations_seconds RPC latency distributions. | ||||||
# TYPE rpc_durations_seconds summary | ||||||
rpc_durations_seconds{service="exponential",quantile="0.5"} 7.689368882420941e-07 ct@1.7268398130168908e+09 | ||||||
rpc_durations_seconds{service="exponential",quantile="0.9"} 1.6537614174305048e-06 ct@1.7268398130168908e+09 | ||||||
rpc_durations_seconds{service="exponential",quantile="0.99"} 2.0965499063061924e-06 ct@1.7268398130168908e+09 | ||||||
rpc_durations_seconds_sum{service="exponential"} 2.0318666372575776e-05 ct@1.7268398130168908e+09 | ||||||
rpc_durations_seconds_count{service="exponential"} 22 ct@1.7268398130168908e+09 | ||||||
``` | ||||||
|
||||||
### Backwards compatibility and semantic versioning | ||||||
|
||||||
This change is not backwards compatible and would break existing parsers that expect the `_created` line. OpenMetrics 1.x parsers that support `_created` lines would not be able to parse the new syntax. This would require a new major version of the OpenMetrics specification to be released, i.e, OpenMetrics 2.0. Any client libraries or tools that expose OpenMetrics text would also need to be updated to support the new syntax. | ||||||
|
||||||
## Alternatives | ||||||
|
||||||
### Do nothing | ||||||
|
||||||
Continue using the OM text format in its current state and further optimize CPU usage through PRs that build on [github.com/prometheus/prometheus/issues/14823](https://github.com/prometheus/prometheus/issues/14823). This is desirable if we wish to keep backwards compatibility and avoids breaking changes but we would have to live with an inefficient solution. | ||||||
|
||||||
### Storing CTs Using a `# HELP`-Like Syntax | ||||||
|
||||||
In addition to the `TYPE`, `UNIT`, and `HELP` fields, we can introduce a `# CREATED` line for metrics that have an associated creation timestamp (CT). This approach allows us to quickly determine whether a CT exists for a given metric, eliminating the need for a more time-consuming search process. By parsing the `# CREATED` line, we can associate it with a specific hash corresponding to the metric's label set, thereby mapping each CT to the correct metric. | ||||||
|
||||||
However, this method still involves the overhead of storing potentially multiple CTs in memory until we encounter the relevant metric line. Furthermore the `CREATED` line itself might look somewhat convoluted compared to `TYPE` , `UNIT` , and `HELP` which are very human readable. | ||||||
|
||||||
An example of how this might look: | ||||||
|
||||||
``` | ||||||
# HELP foo Counter with and without labels to certify CT is parsed for both cases | ||||||
# TYPE foo counter | ||||||
# CREATED 1520872607.123; {a="b"} 1520872607.123; {le="c"} 1520872621.123 | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is one option, but another option would be to do |
||||||
foo_total 17.0 1520879607.789 # {id="counter-test"} 5 | ||||||
foo_total{a="b"} 17.0 1520879607.789 # {id="counter-test"} 5 | ||||||
foo_total{le="c"} 21.0 | ||||||
foo_total{le="1"} 10.0 | ||||||
``` | ||||||
|
||||||
When the same MetricFamily has multiple label sets with their own CTs we'd have to place all the timestamps in a single line with the additional labels and separate them with delimiters. This does mitigate the increased verbosity described in the [Summaries and Histograms section](#summaries-and-histograms) but it still requires storing all the timestamps in memory for the lifetime of the MetricFamily. Even if we store the CT only for the first label set in a MetricFamily, we only need to keep a single timestamp in memory at a time for each label set in a summary or histogram until the next label set is encountered (assuming metrics with the same label sets are grouped together). | ||||||
|
||||||
## Action Plan | ||||||
|
||||||
In no particular order: | ||||||
|
||||||
* [ ] Update the OpenMetrics specification to include the new syntax for Created Timestamps. | ||||||
* [ ] Update Prometheus's OpenMetrics text parser to support the new syntax. | ||||||
* [ ] Update tooling and libraries that expose OpenMetrics text. | ||||||
Maniktherana marked this conversation as resolved.
Show resolved
Hide resolved
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.