@KalakondaSainath I would suggest you use LangSmith to trace your chain; that way you will be better able to break down the latency coming from each runnable in the chain.
@keenborder786 - Bringing LangSmith into our org is a big task: it means going through approvals and buying the product. Can we get input on why there is a gap of 2.5 minutes in the run below, between acquiring the ClientSecret token and the retry request? Are there any additional logs we can enable in LangChain to understand how the retry logic ("wait_exponential") is used in create_base_retry_decorator?
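On the additional-logs question: the "Retrying request to /chat/completions in ..." line appears to come from the underlying openai client rather than from LangChain itself, so one low-effort option is to raise Python logging levels for the client libraries. A minimal sketch; the logger names below are simply the libraries' module names and are worth double-checking in your environment:

```python
import logging

# Root config so DEBUG records are actually emitted, with timestamps
# to correlate against the 2.5-minute gap.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Surface per-request and retry detail from the HTTP/auth stack.
for name in ("openai", "httpx", "httpcore", "azure.identity"):
    logging.getLogger(name).setLevel(logging.DEBUG)
```

With timestamps enabled, the gap should show up between the last azure.identity/httpx record and the first openai request record.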
@rayamoh I think the best solution in your case is to write a custom callback handler that keeps track of timing, and then use it in the chain:

```python
from langchain_core.callbacks import StdOutCallbackHandler, StreamingStdOutCallbackHandler
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableConfig
from langchain_openai import ChatOpenAI

prompt = PromptTemplate.from_template("Answer the following question: {question}")
chain = prompt | ChatOpenAI()
chain.invoke(
    input={"question": "What is LangChain?"},
    config=RunnableConfig(callbacks=[StdOutCallbackHandler(), StreamingStdOutCallbackHandler()]),
)
```
Checked other resources
Example Code
```python
import os

from azure.identity import ClientSecretCredential, get_bearer_token_provider
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import AzureChatOpenAI


def llm_connection(model="gpt-4o",
                   temperature=0.2,
                   top_p=0.1,
                   max_tokens=2000,
                   max_retries=1):
    credentials = {
        "tenant_id": os.getenv("AZURE_TENANT_ID"),
        "client_id": os.getenv("AZURE_CLIENT_ID"),
        "client_secret": os.getenv("AZURE_CLIENT_SECRET"),
        "openai_endpoint": os.getenv("API_BASE"),
        "azure_api_version": os.getenv("AZURE_API_VERSION", "2024-04-01-preview"),
        "subscription_key": os.getenv("SUBSCRIPTION_KEY"),
    }
    return instantiate_llm(credentials, model, temperature, top_p,
                           max_tokens, max_retries)


def instantiate_llm(
    credentials: dict,
    azure_deployment: str = "gpt-4o",
    temperature: float = 0.2,
    top_p: float = 0.1,
    max_tokens: int = 1000,
    max_retries: int = 1,
):
    """Instantiate llm model (AAD auth via client-secret credential)."""
    token_provider = get_bearer_token_provider(
        ClientSecretCredential(
            tenant_id=credentials["tenant_id"],
            client_id=credentials["client_id"],
            client_secret=credentials["client_secret"],
        ),
        "https://cognitiveservices.azure.com/.default",
    )
    return AzureChatOpenAI(
        azure_endpoint=credentials["openai_endpoint"],
        api_version=credentials["azure_api_version"],
        azure_deployment=azure_deployment,
        azure_ad_token_provider=token_provider,
        temperature=temperature,
        top_p=top_p,
        max_tokens=max_tokens,
        max_retries=max_retries,
    )


llm = llm_connection()
prompt_template = ChatPromptTemplate.from_messages(
    [("system", EMAIL_WRITER_SYSTEM)]
)
chain = prompt_template | llm | StrOutputParser()
response = chain.invoke(<request_payload>)
```
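To narrow down whether the gap is spent inside ClientSecretCredential's token acquisition, one option is to wrap the credential in a timing shim before passing it to get_bearer_token_provider, which only needs an object exposing a get_token method. A sketch; the TimedCredential class name is my own, not an azure-identity class:

```python
import time


class TimedCredential:
    """Duck-typed wrapper that times each get_token call on the wrapped
    credential, to show whether AAD token acquisition accounts for the
    delay before the POST request is made."""

    def __init__(self, inner):
        self._inner = inner
        self.timings = []  # seconds taken by each get_token call

    def get_token(self, *scopes, **kwargs):
        start = time.monotonic()
        try:
            return self._inner.get_token(*scopes, **kwargs)
        finally:
            self.timings.append(time.monotonic() - start)
            print(f"get_token{scopes} took {self.timings[-1]:.3f}s")
```

Usage would be `get_bearer_token_provider(TimedCredential(ClientSecretCredential(...)), scope)` in place of the raw credential; if the printed durations line up with the 2.5-minute gap, the latency is in auth rather than in the chain itself.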
Error Message and Stack Trace (if applicable)
```
Retrying request to /chat/completions in 0.376881 seconds
```
Description
We have observed intermittent latency in chain.invoke. At times it takes a couple of minutes before the OpenAI HTTP POST request is made, and there is no logging that shows which operation is taking the time. With the same payload, the whole chain sometimes completes in 10 seconds, whereas at other times we observe the latency with no logging other than the message "Retrying request to /chat/completions in". We would like to understand why LangChain's invoke shows this latency.
Note: both requests use the same payload.
Latency observed during this sequence of steps
No latency during this call -
System Info
$ pip freeze
aiohappyeyeballs==2.4.3
aiohttp==3.10.10
aiosignal==1.3.1
annotated-types==0.7.0
anyio==4.6.2.post1
async-timeout==4.0.3
attrs==24.2.0
azure-core==1.32.0
azure-functions==1.21.3
azure-identity==1.19.0
beautifulsoup4==4.12.3
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.4.0
click==8.1.7
colorama==0.4.6
coverage==7.6.4
cryptography==43.0.3
databricks-sql-connector==3.4.0
dataclasses-json==0.6.7
distro==1.9.0
et_xmlfile==2.0.0
exceptiongroup==1.2.2
fastapi==0.115.4
frozenlist==1.5.0
greenlet==3.1.1
h11==0.14.0
httpcore==1.0.6
httpx==0.27.2
httpx-sse==0.4.0
idna==3.10
iniconfig==2.0.0
isodate==0.7.2
jiter==0.7.0
jsonpatch==1.33
jsonpointer==3.0.0
langchain==0.3.7
langchain-community==0.3.5
langchain-core==0.3.15
langchain-openai==0.2.6
langchain-text-splitters==0.3.2
langsmith==0.1.140
lxml==5.3.0
lz4==4.3.3
marshmallow==3.23.1
msal==1.31.0
msal-extensions==1.2.0
msrest==0.7.1
multidict==6.1.0
mypy-extensions==1.0.0
numpy==1.26.4
oauthlib==3.2.2
openai==1.54.3
openpyxl==3.1.5
orjson==3.10.11
packaging==24.1
pandas==2.2.0
pluggy==1.5.0
portalocker==2.10.1
propcache==0.2.0
py4j==0.10.9.5
pyarrow==16.1.0
pycparser==2.22
pydantic==2.9.2
pydantic-settings==2.6.1
pydantic_core==2.23.4
PyJWT==2.9.0
pyspark==3.2.2
pytest==8.3.3
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
pytz==2024.2
pywin32==308
PyYAML==6.0.2
regex==2024.11.6
requests==2.32.3
requests-oauthlib==2.0.0
requests-toolbelt==1.0.0
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
SQLAlchemy==2.0.35
starlette==0.41.2
tenacity==9.0.0
thrift==0.20.0
tiktoken==0.8.0
tomli==2.0.2
tqdm==4.67.0
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.2
urllib3==2.2.3
uvicorn==0.32.0
yarl==1.17.1