@KalakondaSainath I would suggest you use LangSmith to trace your chain; that way you will be better able to break down the latency coming from each runnable in the chain.
@keenborder786 - Bringing LangSmith into our org is a big task: it means going through approvals and buying the product. Can we get input on why there is a gap of 2.5 minutes in the run below, between acquiring the ClientSecret token and the retry request? Are there any additional logs we can enable in LangChain to understand how the retry logic ("wait_exponential") is used in create_base_retry_decorator?
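On the additional-logs question: the "Retrying request to /chat/completions in ..." line appears to come from the underlying openai client rather than from LangChain itself, so one low-effort option is to raise Python logging levels for the client libraries. A minimal sketch; the logger names below are simply the libraries' module names and are worth double-checking in your environment:

```python
import logging

# Root config so DEBUG records are actually emitted, with timestamps
# to correlate against the 2.5-minute gap.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Surface per-request and retry detail from the HTTP/auth stack.
for name in ("openai", "httpx", "httpcore", "azure.identity"):
    logging.getLogger(name).setLevel(logging.DEBUG)
```

With timestamps enabled, the gap should show up between the last azure.identity/httpx record and the first openai request record.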
@rayamoh I think the best solution in your case is to write a custom callback handler that keeps track of timing, and then use it in the chain:

```python
from langchain_core.callbacks import StdOutCallbackHandler, StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableConfig
from langchain_openai import ChatOpenAI

prompt = PromptTemplate.from_template("Answer the following question: {question}")
chain = prompt | ChatOpenAI()
chain.invoke(
    input={"question": "What is LangChain?"},
    config=RunnableConfig(callbacks=[StdOutCallbackHandler(), StreamingStdOutCallbackHandler()]),
)
```
Checked other resources
Example Code
```python
import os

from azure.identity import ClientSecretCredential, get_bearer_token_provider
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import AzureChatOpenAI


def llm_connection(model="gpt-4o",
                   temperature=0.2,
                   top_p=0.1,
                   max_tokens=2000,
                   max_retries=1):
    credentials = {
        "tenant_id": os.getenv("AZURE_TENANT_ID"),
        "client_id": os.getenv("AZURE_CLIENT_ID"),
        "client_secret": os.getenv("AZURE_CLIENT_SECRET"),
        "openai_endpoint": os.getenv("API_BASE"),
        "azure_api_version": os.getenv("AZURE_API_VERSION", "2024-04-01-preview"),
        "subscription_key": os.getenv("SUBSCRIPTION_KEY"),
    }
    return instantiate_llm(credentials, model, temperature, top_p,
                           max_tokens, max_retries)


def instantiate_llm(
    credentials: dict,
    azure_deployment: str = "gpt-4o",
    temperature: float = 0.2,
    top_p: float = 0.1,
    max_tokens: int = 1000,
    max_retries: int = 1,
):
    """Instantiate llm model (AAD auth via client-secret credential)."""
    token_provider = get_bearer_token_provider(
        ClientSecretCredential(
            tenant_id=credentials["tenant_id"],
            client_id=credentials["client_id"],
            client_secret=credentials["client_secret"],
        ),
        "https://cognitiveservices.azure.com/.default",
    )
    return AzureChatOpenAI(
        azure_endpoint=credentials["openai_endpoint"],
        api_version=credentials["azure_api_version"],
        azure_deployment=azure_deployment,
        azure_ad_token_provider=token_provider,
        temperature=temperature,
        top_p=top_p,
        max_tokens=max_tokens,
        max_retries=max_retries,
    )


llm = llm_connection()
prompt_template = ChatPromptTemplate.from_messages(
    [("system", EMAIL_WRITER_SYSTEM)]
)
chain = prompt_template | llm | StrOutputParser()
response = chain.invoke(<request_payload>)
```
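To narrow down whether the gap is spent inside ClientSecretCredential's token acquisition, one option is to wrap the credential in a timing shim before passing it to get_bearer_token_provider, which only needs an object exposing a get_token method. A sketch; the TimedCredential class name is my own, not an azure-identity class:

```python
import time


class TimedCredential:
    """Duck-typed wrapper that times each get_token call on the wrapped
    credential, to show whether AAD token acquisition accounts for the
    delay before the POST request is made."""

    def __init__(self, inner):
        self._inner = inner
        self.timings = []  # seconds taken by each get_token call

    def get_token(self, *scopes, **kwargs):
        start = time.monotonic()
        try:
            return self._inner.get_token(*scopes, **kwargs)
        finally:
            self.timings.append(time.monotonic() - start)
            print(f"get_token{scopes} took {self.timings[-1]:.3f}s")
```

Usage would be `get_bearer_token_provider(TimedCredential(ClientSecretCredential(...)), scope)` in place of the raw credential; if the printed durations line up with the 2.5-minute gap, the latency is in auth rather than in the chain itself.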
Error Message and Stack Trace (if applicable)
```
Retrying request to /chat/completions in 0.376881 seconds
```
Description
We have observed intermittent latency in chain.invoke. At times it takes a couple of minutes before the OpenAI HTTP POST request is made, and there is no logging that shows which operation is taking the time. With the same payload, the whole chain sometimes completes in 10 seconds, whereas at other times we observe the latency with no logging other than the message "Retrying request to /chat/completions in". We would like to understand why LangChain's invoke shows this latency.
Note: both requests use the same payload.
Latency observed during this sequence of steps
No latency during this call -
System Info
$ pip freeze
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.6.2.post1
async-timeout==4.0.3
attrs==24.2.0
azure-core==1.32.0
azure-functions==1.21.3
azure-identity==1.19.0
beautifulsoup4==4.12.3
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.4.0
click==8.1.7
colorama==0.4.6
coverage==7.6.4
cryptography==43.0.3
databricks-sql-connector==3.4.0
dataclasses-json==0.6.7
distro==1.9.0
et_xmlfile==2.0.0
exceptiongroup==1.2.2
fastapi==0.115.4
frozenlist==1.5.0
greenlet==3.1.1
h11==0.14.0
httpcore==1.0.6
httpx==0.27.2
httpx-sse==0.4.0
idna==3.10
iniconfig==2.0.0
isodate==0.7.2
jiter==0.7.0
jsonpatch==1.33
jsonpointer==3.0.0
langchain==0.3.7
langchain-community==0.3.5
langchain-core==0.3.15
langchain-openai==0.2.6
langchain-text-splitters==0.3.2
langsmith==0.1.140
lxml==5.3.0
lz4==4.3.3
marshmallow==3.23.1
msal==1.31.0
msal-extensions==1.2.0
msrest==0.7.1
multidict==6.1.0
mypy-extensions==1.0.0
numpy==1.26.4
oauthlib==3.2.2
openai==1.54.3
openpyxl==3.1.5
orjson==3.10.11
packaging==24.1
pandas==2.2.0
pluggy==1.5.0
portalocker==2.10.1
propcache==0.2.0
py4j==0.10.9.5
pyarrow==16.1.0
pycparser==2.22
pydantic==2.9.2
pydantic-settings==2.6.1
pydantic_core==2.23.4
PyJWT==2.9.0
pyspark==3.2.2
pytest==8.3.3
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
pytz==2024.2
pywin32==308
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.35
starlette==0.41.2
tenacity==9.0.0
thrift==0.20.0
tiktoken==0.8.0
tomli==2.0.2
tqdm==4.67.0
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.2
urllib3==2.2.3
uvicorn==0.32.0
yarl==1.17.1