Propagated traces have incorrect start time #698

omikader · 2024-10-01T19:52:26Z

Description

We are seeing strange delays in the start time for propagated traces emitted from our Flask API. Notice how the HTTP GET trace, which is emitted by our React SPA, starts and finishes before the traces emitted by the Flask API server seem to even start.

Steps to reproduce

Add Faro to the project and enable propagateTraceHeaderCorsUrls so that the Traceparent header is sent to the backend server.
Instrument a Flask server using OpenTelemetry's zero-code auto-instrumentation, which automatically creates traces, metrics, and logs for popular Python libraries like flask and mysql-connector-python.
Click around in the React app and verify that a value for Traceparent is also set in the request headers to the API server. This allows us to successfully correlate the traces emitted in Flask with the original web request from the React app.
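
To check the Traceparent value seen in the request headers, the header can be split into its fields. This is a minimal sketch assuming the W3C Trace Context version 00 layout (version, trace ID, parent span ID, flags, all lowercase hex); the function name parse_traceparent and the sample header value are illustrative and do not come from this issue.

```python
import re

# W3C Trace Context, version 00: version-traceid-parentid-flags (lowercase hex).
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(header):
    """Return the header fields as a dict, or None if the value is malformed.

    Hypothetical helper for manual checks; not part of Faro or OpenTelemetry.
    """
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version "ff" and all-zero IDs are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        # Bit 0 of the flags byte is the "sampled" flag.
        "sampled": bool(int(flags, 16) & 0x01),
    }


# Illustrative header value, not captured from the application in this issue.
example = parse_traceparent(
    "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
)
```

If the trace_id parsed from the browser request matches the trace ID on the Flask spans, propagation is working and only the timestamps are in question.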

Expected behavior

I expected the time reported in Grafana for this trace to be identical to the time reported by the network tab in the browser.
I expected any propagated traces to start and finish within the timing of the top-level trace associated with the web request from the dor-omar-frontend app.

Actual behavior

The event sent to the collector suggests that the request took 66ms

The network tab suggests that the request took 28ms + 29ms = 57ms. This is in line with the 56ms reported by Grafana for the light-blue trace.

The top-level trace suggests that it took 391.32ms, which I believe is due to the delayed reporting of the propagated traces.

Environment

SDK version: 1.10.0
SDK instrumentations: Web SDK, React, React Router, Web Tracing
Device type: laptop
Device name: MacBook Air 13"
OS: macOS Sonoma 14.6.1
Browser: Chrome 128.0.6613.120


JordiPolo · 2024-10-02T00:41:06Z

I think this happens because the clock in the laptop and the clock in the server are 200ms apart.
Consumer laptops are not strict about continuously syncing their clocks; being a couple of seconds ahead of or behind some reference atomic clock makes no difference to them.
Servers tend to be tightly synced within the datacenter, so this effect does not show up between services.
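
The effect of such an offset on the displayed trace can be sketched with a little arithmetic. The 56ms client span and the 200ms offset come from this thread; the server span timing (starting 10ms into the request and lasting 40ms) is a made-up assumption, as are the Span class and helper names.

```python
from dataclasses import dataclass


@dataclass
class Span:
    name: str
    start_ms: float
    duration_ms: float

    @property
    def end_ms(self):
        return self.start_ms + self.duration_ms


def apparent_trace_duration(spans):
    """Duration a trace viewer would show: latest end minus earliest start."""
    return max(s.end_ms for s in spans) - min(s.start_ms for s in spans)


def with_clock_offset(span, offset_ms):
    """Same span as recorded by a clock running offset_ms ahead."""
    return Span(span.name, span.start_ms + offset_ms, span.duration_ms)


# Client span as measured by the browser (56 ms, from the issue).
client = Span("HTTP GET (browser)", 0.0, 56.0)
# Hypothetical true server timing, fully inside the client span.
server_true = Span("GET (Flask)", 10.0, 40.0)
# The same server span stamped by a clock 200 ms ahead of the laptop.
server_skewed = with_clock_offset(server_true, 200.0)

print(apparent_trace_duration([client, server_true]))    # 56.0
print(apparent_trace_duration([client, server_skewed]))  # 250.0
```

With these assumed numbers the server span appears to start at 210ms, well after the client span has ended at 56ms, and the whole trace looks several times longer than the request really took.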

omikader · 2024-10-02T16:59:22Z

Hi @JordiPolo, thanks for the quick response! Given that this is the case, would you say it is not valuable to propagate traces from a client's device to actions taken on the server? Or perhaps is there a canonical strategy for resetting the start time for the client-side trace once it is captured by the Faro receiver?

JordiPolo · 2024-10-02T17:52:04Z

Maybe at some point there will be a way to know the difference between the clocks and a solution for it. Right now this is what you will get. To me it still makes sense to have this information, as long as the people reading the trace understand why it looks like that.
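
One general way to estimate the difference between two clocks is the NTP-style four-timestamp calculation. The sketch below only illustrates that idea; nothing in this thread indicates that Faro or the Faro receiver does this, the function names are hypothetical, and the estimate is only exact when network delay is the same in both directions.

```python
def estimate_server_offset_ms(t0, t1, t2, t3):
    """Estimate how far the server clock is ahead of the client clock.

    t0: client sends request   (client clock)
    t1: server receives it     (server clock)
    t2: server sends response  (server clock)
    t3: client receives it     (client clock)
    Assumes symmetric network delay.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def server_to_client_time(server_ts_ms, offset_ms):
    """Translate a server timestamp onto the client's timeline."""
    return server_ts_ms - offset_ms


# Made-up example: server clock 200 ms ahead, 8 ms delay each way,
# 40 ms of server processing.
t0 = 1000.0  # client clock
t1 = 1208.0  # server clock (client time 1008 + 200)
t2 = 1248.0  # server clock (40 ms later)
t3 = 1056.0  # client clock (client time 1048 + 8)

offset = estimate_server_offset_ms(t0, t1, t2, t3)
print(offset)                             # 200.0
print(server_to_client_time(t1, offset))  # 1008.0
```

After the correction, the server's receive time lands at 1008ms on the client timeline, inside the 1000ms to 1056ms window of the client request.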

omikader added the bug label Oct 1, 2024

omikader mentioned this issue Oct 2, 2024

Adjust client/server span times to account for clock skew grafana/grafana#93060

Open
