I am looking at langchain instrumentation using OpenTelemetry, including existing approaches such as openinference and openllmetry, as well as the langchain tracer itself for langsmith, which doesn't use OpenTelemetry. The abstractions seem to be the same in Python and JS, so this discussion is meant to apply to both, and the concepts should apply to any system that wants to do tracing, whether it uses OpenTelemetry or not.
Both approaches use callbacks to start and stop spans around points in the callback tree, either using the BaseTracer abstraction or normal callbacks, implementing start and end entry points:

https://github.com/Arize-ai/openinference/blob/main/python/instrumentation/openinference-instrumentation-langchain/src/openinference/instrumentation/langchain/_tracer.py#L125
https://github.com/traceloop/openllmetry/blob/main/packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py#L481
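For concreteness, here is a minimal sketch of what that start/end style looks like (this is not the actual openinference/openllmetry code; the class and method names are only illustrative): a span is started in the *_start callback, stashed by run_id, and ended in the matching *_end callback.

```python
from typing import Any
from uuid import UUID

from opentelemetry import trace


class SpanPerRunHandler:
    """Illustrative start/end-style handler; spans are kept keyed by run_id."""

    def __init__(self) -> None:
        self.tracer = trace.get_tracer("langchain-instrumentation")
        self._spans: dict[UUID, trace.Span] = {}

    def on_chain_start(
        self, serialized: dict[str, Any], inputs: dict[str, Any], *, run_id: UUID, **kwargs: Any
    ) -> None:
        # The span is started here but is NOT made the "current" span, so
        # decoupled libraries called inside the chain won't see it as a parent.
        self._spans[run_id] = self.tracer.start_span("chain")

    def on_chain_end(self, outputs: dict[str, Any], *, run_id: UUID, **kwargs: Any) -> None:
        self._spans.pop(run_id).end()
```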
This makes sense given it's generally best to use abstractions provided by libraries when instrumenting them, in this case the langchain callback handlers. However, because the entry points are start/end, we miss one important feature of tracing, which is context (trace ID/span ID, not LLM context) propagation between decoupled libraries. This is important so that, e.g., the chat model span generated within langchain can be the parent of a span generated in the openai client library, which can be the parent of an HTTP span whose context is sent as HTTP headers to a backend.
This is usually done by propagating some sort of global state between calls to, e.g., the OTel API from different libraries, with mechanisms such as contextvars in Python, AsyncLocalStorage in JS, ThreadLocal in Java, etc. However, we can't implement this pattern with start/end entry points because the code within the scope of a context needs to be contiguous. For example, in Python, a context manager is used:
```python
with self.tracer.start_as_current_span(
    ...
) as span:
    requests.get("/hello")  # This HTTP call will parent to the above span
```
An alternative would be to allow returning a value from start that is passed to end. However, while I think this would work in Python, I don't think JS supports the pattern, and either way the current callback interface doesn't allow for it. It's generally considered better to stick to a contiguous approach like a context manager if at all possible, so if we need a new approach I would suggest aiming for that.
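To illustrate what that Python-only variant might look like (purely a sketch on my part, using the OTel context API's attach/detach tokens; the start/end function names are hypothetical):

```python
from opentelemetry import context, trace

tracer = trace.get_tracer("example")


def start() -> tuple[trace.Span, object]:
    span = tracer.start_span("chain")
    # attach() makes the span current and returns a token that must be
    # handed back to detach() later; this relies on Python's contextvars.
    token = context.attach(trace.set_span_in_context(span))
    return span, token  # this pair would have to be threaded through to end()


def end(span: trace.Span, token: object) -> None:
    context.detach(token)
    span.end()
```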
I am wondering if it would be possible to enhance the callback mechanism to allow implementations to propagate context around the operations? A common pattern is to provide a single method with a next function:
```python
def on_chain(
    self,
    next: Callable[[dict[str, Any], dict[str, Any], ...], dict[str, Any]],
    serialized: dict[str, Any],
    inputs: dict[str, Any],
    **kwargs: Any,
):
    with self.tracer.start_as_current_span(
        # Use inputs to set span attributes
    ) as span:
        outputs = next(serialized, inputs, **kwargs)
        # Use outputs to set span attributes
        return outputs
```
etc. One approach would be to add these to the existing callbacks interface, or it could make sense to have a separate one. These methods should end up as a "superset" of the existing ones, in the sense that while a context-propagating callback handler can't be mapped to the existing pattern, the opposite should be possible. This means the business logic could use just the new scheme, while user callbacks in the previous format can be converted when provided (a rough adapter is sketched below).
(this isn't meant to be precise code but the general gist)
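As an illustration of that conversion (again only a sketch with hypothetical names; it assumes the legacy handler exposes on_chain_start/on_chain_end/on_chain_error-style methods):

```python
from typing import Any, Callable
from uuid import uuid4


class LegacyHandlerAdapter:
    """Drives a start/end-style handler from the proposed next()-based callback."""

    def __init__(self, legacy_handler: Any) -> None:
        self.legacy = legacy_handler

    def on_chain(
        self,
        next: Callable[..., dict[str, Any]],
        serialized: dict[str, Any],
        inputs: dict[str, Any],
        **kwargs: Any,
    ) -> dict[str, Any]:
        kwargs.setdefault("run_id", uuid4())
        self.legacy.on_chain_start(serialized, inputs, **kwargs)
        try:
            outputs = next(serialized, inputs, **kwargs)
        except BaseException as err:
            self.legacy.on_chain_error(err, **kwargs)
            raise
        self.legacy.on_chain_end(outputs, **kwargs)
        return outputs
```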
Would it make sense to make such a change to langchain? Happy to hear any thoughts on a way forward (or not, if it's too big a change) for supporting callbacks that require context propagation like the above. Thanks.