I have an application that takes HTTP calls and converts the information into a RabbitMQ publish. I use OpenTelemetry to report traces into Jaeger. When I run the application and blast it with HTTP calls, it consumes all the CPU on its VM after about 6 minutes and the process has to be force-killed. With all of my OTel code commented out, I couldn't get the application to crash.
Because this application manages publishing to RMQ for multiple services, I couldn't just rely on one Resource. I needed each service to report into Jaeger as its own resource. Here's my code for setting this up:
from opentelemetry import trace, context
from opentelemetry.propagate import inject, extract
from opentelemetry.trace.status import Status, StatusCode
from opentelemetry.context import attach, detach
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor


class BasePublisher(ABC):
    name: str
    tracer_provider: Optional[TracerProvider] = None

    def set_tracer_provider(self):
        tracer_provider = TracerProvider(resource=Resource.create({"service.name": self.name}))
        jaeger_exporter = JaegerExporter(agent_host_name=env['JAEGER_ADDRESS'], agent_port=int(env['JAEGER_PORT']))
        tracer_provider.add_span_processor(BatchSpanProcessor(jaeger_exporter))
        self.tracer_provider = tracer_provider

    async def publish(self, exchange: Exchange, msg: Union[str, bytes], routing_key: str, correlation_id: str=str(uuid.uuid4())):
        if self.tracer_provider is None:
            self.set_tracer_provider()
        tracer = trace.get_tracer(self.name, tracer_provider=self.tracer_provider)
        message = Message(...)
        log.info(f"Publishing message with rouing key `{routing_key}`")
        with tracer.start_as_current_span("publish") as span:
            span.set_attribute("rmq.message.routing_key", routing_key)
            span.set_attribute("rmq.message.correlation_id", correlation_id if correlation_id else False)
            span.set_attribute("rmq.message.timestamp", str(message.timestamp))
            # My application also handles consuming. The context is injected into my RMQ message so it can be used
            # if a consume is a part of the trace
            inject(message.headers, context=context.get_current())
            await exchange.publish(message, routing_key, mandatory=False)
        log.info("Message published")
        handler.flush()
        return "Message published successfully"
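A side note on the `publish` signature above, separate from the CPU problem: Python evaluates default argument values once, when the function is defined, so `str(uuid.uuid4())` as a default yields the same correlation ID for every call that omits the argument. The sketch below illustrates this with two hypothetical stand-in functions (`publish_shared_default` and `publish_fresh_default` are not part of the original code); it only demonstrates the language behavior and one common way around it.

```python
import uuid
from typing import Optional


def publish_shared_default(correlation_id: str = str(uuid.uuid4())) -> str:
    # The default value is computed once, at function definition time.
    return correlation_id


def publish_fresh_default(correlation_id: Optional[str] = None) -> str:
    # Generate a new ID on each call when the caller does not supply one.
    if correlation_id is None:
        correlation_id = str(uuid.uuid4())
    return correlation_id


# Two calls without an argument share one ID in the first version...
same = publish_shared_default() == publish_shared_default()
# ...and get distinct IDs in the second.
different = publish_fresh_default() != publish_fresh_default()
print(same, different)
```

Using `None` as the default also keeps the existing `correlation_id if correlation_id else False` check meaningful, since the argument can then actually be empty.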
Each service implements the BasePublisher class. I'm new to Python, but I think my issue is that I'm creating a lot of tracer providers. After the 10 millionth one is created, things start to crash. Interestingly, there is no memory spike when the crash happens. My reason for thinking this is that every time I call MyService.publish(), I can see that it has to call self.set_tracer_provider(). I would hope that it only has to be called once per application run.
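One possible explanation for the repeated setup, assuming a new publisher object is created for each request (the stripped-down sample does not show how services are instantiated): `tracer_provider` is declared at class level with a default of `None`, but `self.tracer_provider = ...` stores the value on that one instance only. Every fresh instance therefore sees `None` again. The sketch below reproduces the pattern with a hypothetical `Publisher` class and a plain `object()` standing in for the real provider.

```python
from typing import Optional


class Publisher:
    # Class-level default: every new instance starts out seeing None.
    provider: Optional[object] = None
    # Counts how often the (expensive) setup runs.
    setup_calls = 0

    def set_provider(self) -> None:
        Publisher.setup_calls += 1
        # Assigning through self creates an attribute on this instance only;
        # the class-level default stays None.
        self.provider = object()

    def publish(self) -> None:
        if self.provider is None:
            self.set_provider()


# One long-lived instance: setup runs once.
shared = Publisher()
for _ in range(3):
    shared.publish()
calls_with_shared_instance = Publisher.setup_calls

# A fresh instance per call: setup runs every time.
Publisher.setup_calls = 0
for _ in range(3):
    Publisher().publish()
calls_with_fresh_instances = Publisher.setup_calls

print(calls_with_shared_instance, calls_with_fresh_instances)  # 1 3
```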
This is a stripped down sample of my code. I can provide more context if needed. Any help would be appreciated. Thanks.
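If the provider really is being rebuilt on every publish, one way to get "once per application run" is to cache one provider per service name outside the instance. This matters because, as far as I can tell, each `BatchSpanProcessor` starts its own background export thread, so thousands of providers would mean thousands of threads. The following is a minimal sketch, not a definitive fix: `PerServiceCache`, `build_provider`, and `OrdersPublisher` are hypothetical names, and `build_provider` returns a plain dict where the real code would do the `TracerProvider` / `JaegerExporter` / `BatchSpanProcessor` setup from `set_tracer_provider`.

```python
import threading
from typing import Any, Callable, Dict, List


class PerServiceCache:
    """Builds one object per service name and reuses it afterwards."""

    def __init__(self, factory: Callable[[str], Any]) -> None:
        self._factory = factory
        self._items: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def get(self, service_name: str) -> Any:
        # The lock keeps two threads from building the same provider twice.
        with self._lock:
            if service_name not in self._items:
                self._items[service_name] = self._factory(service_name)
            return self._items[service_name]


# Records each build so the effect of caching is visible.
build_calls: List[str] = []


def build_provider(service_name: str) -> Dict[str, str]:
    # Stand-in for the real TracerProvider setup (assumption for this sketch).
    build_calls.append(service_name)
    return {"service.name": service_name}


providers = PerServiceCache(build_provider)


class BasePublisher:
    name = "base"

    def get_provider(self) -> Any:
        # Look up the shared provider instead of storing it on the instance.
        return providers.get(self.name)


class OrdersPublisher(BasePublisher):
    name = "orders"


# Even with a new publisher object per call, the provider is built once.
for _ in range(1000):
    OrdersPublisher().get_provider()

print(build_calls)  # ['orders']
```

Each service still gets its own resource, because the cache is keyed by `self.name`; only the repeated construction goes away.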