
HuggingFaceEndpoint returning buggy responses and echoing the prompt template back #28572

Open
myke11j opened this issue Dec 6, 2024 · 0 comments
Labels
🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature), investigate (Flagged for investigation)

Comments


myke11j commented Dec 6, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_huggingface import HuggingFaceEndpoint

# api_token, BASE_TEMPLATE, retrieve_context, user_query and vector are defined elsewhere
llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",
    huggingfacehub_api_token=api_token,
)
llm_chain = (
    {
        "context": lambda inputs: retrieve_context(inputs["question"], inputs["vector"]),
        "question": RunnablePassthrough(),
    }
    | PromptTemplate(
        template=BASE_TEMPLATE,
        input_variables=["context", "question"],
    )
    | llm
    | StrOutputParser()
)

llm_chain.invoke({"question": user_query, "vector": vector})

Error Message and Stack Trace (if applicable)

As shown in the LangSmith trace, the endpoint returned this output:

[screenshot: LangSmith trace of the chain run, showing the returned output]

Description

I'm using HuggingFaceEndpoint for inference so that I don't have to store the model on my local machine, and I've noticed it produces buggy responses fairly often. I'm using it in a RAG pipeline, and much of the time it simply returns the entire base prompt template wrapped in [INST]...[/INST]. As shown in the screenshot above, it also returned "[/INST]" in a loop until the max-tokens limit was reached. A sketch of the workaround I'm currently experimenting with is below.
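For reference, here is a minimal sketch of that workaround. I'm assuming that wrapping the endpoint in ChatHuggingFace (from langchain_huggingface) applies the model's chat template before the request is sent, and that max_new_tokens bounds the runaway generation; I haven't confirmed that this fully avoids the echoed [INST]...[/INST] output.

from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

# Possible workaround, not a confirmed fix: let ChatHuggingFace apply the
# Llama 3.1 chat template instead of sending the raw prompt string to the endpoint.
llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",
    huggingfacehub_api_token=api_token,  # same api_token as in the example above
    max_new_tokens=512,                  # bound generation so a repetition loop stops sooner
)
chat_model = ChatHuggingFace(llm=llm)

# chat_model replaces `llm` in the same chain:
# ... | PromptTemplate(...) | chat_model | StrOutputParser()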

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 22.4.0: Mon Mar 6 21:00:17 PST 2023; root:xnu-8796.101.5~3/RELEASE_X86_64
Python Version: 3.12.6 (main, Sep 6 2024, 19:03:47) [Clang 15.0.0 (clang-1500.1.0.2.5)]

Package Information

langchain_core: 0.3.19
langchain: 0.3.7
langchain_community: 0.3.7
langsmith: 0.1.143
langchain_huggingface: 0.1.2
langchain_text_splitters: 0.3.2
langchainhub: 0.1.21

Optional packages not installed

langgraph
langserve

Other Dependencies

aiohttp: 3.11.4
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.27.2
httpx-sse: 0.4.0
huggingface-hub: 0.26.2
jsonpatch: 1.33
numpy: 1.26.4
orjson: 3.10.11
packaging: 24.2
pydantic: 2.9.2
pydantic-settings: 2.6.1
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
sentence-transformers: 3.3.1
SQLAlchemy: 2.0.35
tenacity: 9.0.0
tokenizers: 0.20.3
transformers: 4.46.3
types-requests: 2.32.0.20241016
typing-extensions: 4.12.2

@langcarl (bot) added the investigate label Dec 6, 2024
@dosubot (bot) added the 🤖:bug label Dec 6, 2024