Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: telemetry in msg handlers runs multiple times leading to unexpected metric values #21614

Open
1 task done
migueldingli1997 opened this issue Sep 9, 2024 · 3 comments · May be fixed by #21963
Open
1 task done
Assignees
Labels

Comments

@migueldingli1997
Copy link
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

Message logic is executed at least twice, once in checkTx and once during block finalization. Telemetry logic does not distinguish between these two modes and runs regardless, meaning that counters are incremented at least twice.

One example of this is the new_account counter which is incremented when we send tokens to a new account. The result is that this counter actually increments by 2 with each new account.

I suppose it could increment even more times if we run in other modes, such as to simulate a transaction, but I haven't experimented with this.

Cosmos SDK Version

0.50.9

How to reproduce?

  • Run a chain with Prometheus telemetry enabled
  • Send tokens to an account that does not exist
  • Look up the new_account metric at :1317/metrics?format=prometheus
@julienrbrt julienrbrt self-assigned this Sep 27, 2024
@julienrbrt
Copy link
Member

I am not sure how best we can fix it.
IMHO we shouldn't need to query the transaction mode to enrich the metrics each time.
On the other hand, do we only want to collect metrics on DeliverTx? I would guess not as well.

@kocubinski
Copy link
Member

In general it's the telemetry system's job to report what is happening, so if the same code path is being executed multiple times during execution telemetry should log it. Maybe a discriminator label can be attached for execution mode?

In particular though, this metric name seems off, it's incorrect to assume a new account is created every time coins are sent to it and it doesn't already exist (if I read that right).

defer telemetry.IncrCounter(1, "new", "account")

@julienrbrt
Copy link
Member

I somewhat agree with @kocubinski above. Still not convinced about adding a round-trip to the tx service, however.
I'd say probably, if you want accuracy, you should be listening to events.
We can add a note on this behavior in the documentation (https://docs.cosmos.network/main/learn/advanced/telemetry).
Would it be an acceptable solution for you?

@julienrbrt julienrbrt linked a pull request Sep 27, 2024 that will close this issue
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 📋 Backlog
Development

Successfully merging a pull request may close this issue.

3 participants