Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: Update Created Timestamps syntax for OpenMetrics #43

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

Maniktherana
Copy link

@Maniktherana Maniktherana commented Dec 14, 2024

This PR adds a proposal to update the syntax of Created Timestamps in the OpenMetrics text exposition format.

Aims to address prometheus/prometheus#14823

Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
Signed-off-by: Manik Rana <manikrana54@gmail.com>
@Maniktherana Maniktherana marked this pull request as ready for review December 14, 2024 17:47
@Maniktherana
Copy link
Author

@bwplotka @ArthurSens

@Maniktherana Maniktherana changed the title feat: Update Created Timestamps syntax for OpenMetrics proposal: Update Created Timestamps syntax for OpenMetrics Dec 14, 2024
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Copy link
Member

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start; thanks for working on this, Manik!

I've left a few questions and stylistic comments


### Pitfalls of the current solution

As stated above, `_created` lines can appear anywhere after its associated metric line. This means parser is required to store the current position of the lexer before we start searching for the created timestamp. Currently, we cache the timestamp and minimize data stored in memory. Before, we used to make a deep copy of the parser at every line whenever the `CreatedTimestamp` function was called which lead to [Prometheus consuming Gigabytes more memory](https://github.com/prometheus/prometheus/issues/14808) than needed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to mention how the parser used to look like. That implementation was not OM's fault; it was just us iterating over the implementation until we found a more efficient solution 🤔

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

Suggested change
As stated above, `_created` lines can appear anywhere after its associated metric line. This means parser is required to store the current position of the lexer before we start searching for the created timestamp. Currently, we cache the timestamp and minimize data stored in memory. Before, we used to make a deep copy of the parser at every line whenever the `CreatedTimestamp` function was called which lead to [Prometheus consuming Gigabytes more memory](https://github.com/prometheus/prometheus/issues/14808) than needed.
As stated above, `_created` lines can appear anywhere after its associated metric line. This means parser is required to store the current position of the lexer before we start searching for the created timestamp. Currently, we cache the timestamp and minimize data stored in memory but there is still significant CPU overhead.

foo_total{le="1"} 10.0
```

### Summaries and Histograms
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any preference between the two proposals?

For proposal 1: Do we know if that's a complicated thing for SDKs?
For proposal 2: What should the parser do if it sees different CTs for the same histogram/summary?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually haven't thought of this
For 1: I may need to dig into how CTs are applied.
For 2: Good question. Not sure if there's a right answer here besides going with 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mention what you propose, and move other to alternatives. I think I would put it on every line - simpler and we are chatting about the idea of single line for each complex type too: prometheus/OpenMetrics#283

Maniktherana and others added 7 commits December 17, 2024 22:45
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
I should use grammarly

Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Signed-off-by: Manik Rana <Manikrana54@gmail.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I would go for this. Even in the event of we would do a significant change to the format following the prometheus/OpenMetrics#283 it might be still a relevant and valid direction for CT format.

Also, amazingly well written proposal, thanks for this!

## Goals

* Prometheus can parse Created Timestamps without holding state between lines.
* OpenMetrics has an updated specification with a clear and precise syntax.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about "We don't produce artificial metrics for CT semantics."


## How

We store the Created Timestamp inline with the attached metric and remove `_created` lines. This will allow the parser to immediately associate the CT with the metric without having to search for it. For counters, it's straightforward, as we can just add the timestamp after the metric value like an exemplar. To separate it from a traditional timestamp, we can prefix the created timestamp with something like `ct@`, although this portion of the syntax is not final and can be changed. Furthermore, we can order the created timestamp such that it is after the metric value + timestamp and before an exemplar.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
We store the Created Timestamp inline with the attached metric and remove `_created` lines. This will allow the parser to immediately associate the CT with the metric without having to search for it. For counters, it's straightforward, as we can just add the timestamp after the metric value like an exemplar. To separate it from a traditional timestamp, we can prefix the created timestamp with something like `ct@`, although this portion of the syntax is not final and can be changed. Furthermore, we can order the created timestamp such that it is after the metric value + timestamp and before an exemplar.
We store the Created Timestamp inline with the attached metric and remove `_created` lines. This will allow the parser to immediately associate the CT with the metric without having to search for it. For counters, it's straightforward, as we can just add the timestamp after the metric value like an exemplar. To separate it from a traditional timestamp, we can prefix the created timestamp with something like `ct@`, although this portion of the syntax is not final and can be changed. We propose to force a certain order of the created timestamp such that it is after the metric value and the timestamp and before an exemplar.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention the timestamp format/unit (that it is the same as the OM timestamp).


## Goals

* Prometheus can parse Created Timestamps without holding state between lines.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could put less detailed goal e.g. parsing is efficient

This is because it's fine to have some state e.g. HELP and TYPE, UNIT are generally ok because it's clear when they change (no need to recompute or compare series)

foo_total{le="1"} 10.0
```

### Summaries and Histograms
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's mention what you propose, and move other to alternatives. I think I would put it on every line - simpler and we are chatting about the idea of single line for each complex type too: prometheus/OpenMetrics#283

```
# HELP foo Counter with and without labels to certify CT is parsed for both cases
# TYPE foo counter
# CREATED 1520872607.123; {a="b"} 1520872607.123; {le="c"} 1520872621.123
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one option, but another option would be to do CREATED <ts> and then separate "group" when this changes. This is tricky because it in the past the group (metric family) should not duplicate. We try to challenge this in other discussions/proposals though: prometheus/OpenMetrics#280

Comment on lines +5 to +6
* Arthur Silva Sens [@ArthurSens](https://github.com/ArthurSens)
* Bartłomiej Płotka [@bwplotka](https://github.com/bwplotka)
Copy link
Member

@bwplotka bwplotka Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Arthur Silva Sens [@ArthurSens](https://github.com/ArthurSens)
* Bartłomiej Płotka [@bwplotka](https://github.com/bwplotka)
**Contributors:**
* Arthur Silva Sens [@ArthurSens](https://github.com/ArthurSens)
* Bartłomiej Płotka [@bwplotka](https://github.com/bwplotka)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants