I am looking at langchain instrumentation using OpenTelemetry, including existing approaches such as openinference and openllmetry, as well as the langchain tracer itself for langsmith, which doesn't use OpenTelemetry. The abstractions seem to be the same in Python and JS, so this discussion is meant to apply to both, and the concepts should apply to any system that wants to do tracing, whether it uses OpenTelemetry or not.
Both approaches use callbacks to start and stop spans around points in the callback tree, either using the BaseTracer abstraction or normal callbacks, implementing start and end entry points:

https://github.com/Arize-ai/openinference/blob/main/python/instrumentation/openinference-instrumentation-langchain/src/openinference/instrumentation/langchain/_tracer.py#L125
https://github.com/traceloop/openllmetry/blob/main/packages/opentelemetry-instrumentation-langchain/opentelemetry/instrumentation/langchain/callback_handler.py#L481
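For concreteness, here is a minimal sketch of what that start/end style looks like (this is not the actual openinference/openllmetry code; the class and method names are only illustrative): a span is started in the *_start callback, stashed by run_id, and ended in the matching *_end callback.

```python
from typing import Any
from uuid import UUID

from opentelemetry import trace


class SpanPerRunHandler:
    """Illustrative start/end-style handler; spans are kept keyed by run_id."""

    def __init__(self) -> None:
        self.tracer = trace.get_tracer("langchain-instrumentation")
        self._spans: dict[UUID, trace.Span] = {}

    def on_chain_start(
        self, serialized: dict[str, Any], inputs: dict[str, Any], *, run_id: UUID, **kwargs: Any
    ) -> None:
        # The span is started here but is NOT made the "current" span, so
        # decoupled libraries called inside the chain won't see it as a parent.
        self._spans[run_id] = self.tracer.start_span("chain")

    def on_chain_end(self, outputs: dict[str, Any], *, run_id: UUID, **kwargs: Any) -> None:
        self._spans.pop(run_id).end()
```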
This makes sense given it's generally best to use abstractions provided by libraries when instrumenting them, in this case the langchain callback handlers. However, because the entry points are start/end, we miss one important feature of tracing, which is context (trace ID/span ID, not LLM context) propagation between decoupled libraries. This is important so that, e.g., the chat model span generated within langchain can be the parent of a span generated in the openai client library, which can be the parent of an HTTP span whose context is sent as HTTP headers to a backend.
This is usually done by propagating some sort of global state between calls to, e.g., the OTel API from different libraries, with mechanisms such as contextvars in Python, AsyncLocalStorage in JS, ThreadLocal in Java, etc. However, we can't implement this pattern with start/end entry points because the code within the scope of a context needs to be contiguous. For example, in Python, a context manager is used:
```python
with self.tracer.start_as_current_span(
    ...
) as span:
    requests.get("/hello")  # This HTTP call will parent to the above span
```
An alternative would be to allow returning a value from start that is passed to end. However, while I think this would work in Python, I don't think JS supports the pattern, and either way the current callback interface doesn't allow for it. It's generally considered better to stick to a contiguous approach like a context manager if at all possible, so if we need a new approach I would suggest aiming for that.
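To illustrate what that Python-only variant might look like (purely a sketch on my part, using the OTel context API's attach/detach tokens; the start/end function names are hypothetical):

```python
from opentelemetry import context, trace

tracer = trace.get_tracer("example")


def start() -> tuple[trace.Span, object]:
    span = tracer.start_span("chain")
    # attach() makes the span current and returns a token that must be
    # handed back to detach() later; this relies on Python's contextvars.
    token = context.attach(trace.set_span_in_context(span))
    return span, token  # this pair would have to be threaded through to end()


def end(span: trace.Span, token: object) -> None:
    context.detach(token)
    span.end()
```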
I am wondering if it would be possible to enhance the callback mechanism to allow implementations to propagate context around the operations? A common pattern is to provide a single method with a next function:
```python
def on_chain(
    self,
    next: Callable[[dict[str, Any], dict[str, Any], ...], dict[str, Any]],
    serialized: dict[str, Any],
    inputs: dict[str, Any],
    **kwargs: Any,
):
    with self.tracer.start_as_current_span(
        # Use inputs to set span attributes
    ) as span:
        outputs = next(serialized, inputs, **kwargs)
        # Use outputs to set span attributes
        return outputs
```
etc. One approach would be to add these to the existing callbacks interface, or it could make sense to have a separate one. These methods should end up as a "superset" of the existing ones, in the sense that while a context-propagating callback handler can't be mapped to the existing pattern, the opposite should be possible. This means the business logic could use just the new scheme, while user callbacks in the previous format can be converted when provided (a rough adapter is sketched below).
(this isn't meant to be precise code but the general gist)
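As an illustration of that conversion (again only a sketch with hypothetical names; it assumes the legacy handler exposes on_chain_start/on_chain_end/on_chain_error-style methods):

```python
from typing import Any, Callable
from uuid import uuid4


class LegacyHandlerAdapter:
    """Drives a start/end-style handler from the proposed next()-based callback."""

    def __init__(self, legacy_handler: Any) -> None:
        self.legacy = legacy_handler

    def on_chain(
        self,
        next: Callable[..., dict[str, Any]],
        serialized: dict[str, Any],
        inputs: dict[str, Any],
        **kwargs: Any,
    ) -> dict[str, Any]:
        kwargs.setdefault("run_id", uuid4())
        self.legacy.on_chain_start(serialized, inputs, **kwargs)
        try:
            outputs = next(serialized, inputs, **kwargs)
        except BaseException as err:
            self.legacy.on_chain_error(err, **kwargs)
            raise
        self.legacy.on_chain_end(outputs, **kwargs)
        return outputs
```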
Would it make sense to make such a change to langchain? Happy to hear any thoughts on a way forward (or not, if it's too big a change) for supporting callbacks that require context propagation like the above. Thanks.