Due to a recent change in Pinecone's free account offerings, we can now only create indexes of type serverless. Previously, it was also possible to create indexes of type pod. As a result, code for uploading and retrieving vectors written against LangChain's previous version is now failing.
#23571 · Unanswered · Shaileshv1331 asked this question in Q&A · Replies: 1 comment
Hey there, @Shaileshv1331! I'm here to help you with any bugs, questions, or contributions you might need. Let's work together to solve this issue! To update your code to use Pinecone's serverless indexes for uploading and retrieving vectors with LangChain, you can follow the example provided below. This approach aligns with the recent changes in Pinecone's free account offerings.

Updated Approach (Serverless Index)

Uploading Vectors to a Pinecone Serverless Index

```python
from langchain_community.document_loaders import DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_pinecone import PineconeVectorStore
from langchain_openai import OpenAIEmbeddings
import os

# Set environment variables (openai_api_key and pinecone_api_key hold your own keys)
os.environ['OPENAI_API_KEY'] = openai_api_key
os.environ['PINECONE_API_KEY'] = pinecone_api_key

# Initialize embeddings
embeddings = OpenAIEmbeddings()

# Load and split documents
def load_docs(directory):
    loader = DirectoryLoader(directory, glob='**/*.pdf')
    documents = loader.load()
    print(documents)
    return documents

def split_docs(documents, chunk_size=500, chunk_overlap=200):
    text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    docs = text_splitter.split_documents(documents)
    return docs

documents = load_docs(data_path)
docs = split_docs(documents)

# Create a Pinecone vector store from the documents
vectorstore_from_docs = PineconeVectorStore.from_documents(
    docs,
    index_name=index_name,
    embedding=embeddings
)
```

Retrieving Similar Documents

```python
from langchain_pinecone import PineconeVectorStore
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationalRetrievalChain

# Initialize the LLM and the Pinecone vector store
llm = ChatOpenAI(temperature=0.0, model_name=model_name, openai_api_key=openai_api_key)
index = PineconeVectorStore(index_name=index_name, embedding=embeddings)

# Create a conversational retrieval chain; note that as_retriever takes the
# number of results via search_kwargs, not a top_k keyword
qa = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=index.as_retriever(search_kwargs={"k": 10})
)
```

This updated approach ensures compatibility with Pinecone's serverless indexes and uses the latest versions of LangChain, specifically `langchain_pinecone.PineconeVectorStore` and `langchain_openai.OpenAIEmbeddings`.
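The snippets above assume the index already exists. If it does not, a serverless index can be created up front. A minimal sketch, assuming the `pinecone` Python client v3 or later; the index name, cloud, and region below are placeholders, and 1536 is the output dimension of OpenAI's text-embedding-ada-002 model:

```python
import os
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

index_name = "my-serverless-index"  # placeholder name
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=1536,           # matches OpenAIEmbeddings' default model
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # placeholder cloud/region
    )
```

Once the index exists, `PineconeVectorStore.from_documents` can target it by name as shown above.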
Example Code
Previous Approach (POD-based Index)
Updated Approach (Serverless Index)
Description
Around 8-9 months ago, you might have used the Previous Approach code to upload vectors to a Pinecone index and retrieve similar documents. This approach worked well with the Pinecone free account at that time. (FYI, I was using LangChain version 0.0.237 back then.)
Issue Encountered
If you use that code now, you might encounter an error due to changes in Pinecone's free account, which now only supports serverless indexes. The error stack trace might look like this:
This error occurred due to changes in the types of indexes supported by the free account. Unfortunately, the POD-based index, which was previously used, is no longer supported under the current free account plan.
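For contrast, the pod-based pattern from that era looked roughly like the following. This is a from-memory sketch against langchain 0.0.237 and pinecone-client 2.x, not the exact original code; the API key and environment values are placeholders:

```python
# Legacy pod-based flow: pinecone.init and langchain.vectorstores.Pinecone
# are the old APIs that now fail on serverless-only free accounts.
import pinecone
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Pinecone

pinecone.init(api_key="YOUR_PINECONE_KEY", environment="us-west1-gcp")  # placeholders
embeddings = OpenAIEmbeddings()
vectorstore = Pinecone.from_documents(docs, embeddings, index_name="my-index")
```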
To align with the new serverless index type, it's necessary to revise your code using the latest versions of LangChain, specifically utilizing `langchain_pinecone.PineconeVectorStore` for uploading vectors and `langchain_openai.OpenAIEmbeddings` for retrieval. The updated approach I mentioned above, tested with LangChain version 0.1.9, ensures successful vector uploading and retrieval operations.
Key Changes
- Use `langchain_pinecone.PineconeVectorStore` and `langchain_openai.OpenAIEmbeddings` to handle the serverless index.
- Upload vectors with `PineconeVectorStore.from_documents`.
- Retrieve similar documents with `ConversationalRetrievalChain` and `PineconeVectorStore`.
Conclusion
By following this updated approach, you can smoothly transition to using Pinecone’s serverless indexes with LangChain for both uploading vectors and retrieving similar documents. This ensures that your workflow remains efficient and compatible with Pinecone's latest offerings. This guide helps the LangChain and NLP community adapt to changes in Pinecone’s index types, leveraging the latest features in LangChain for a robust solution to managing and querying document embeddings.
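As a side note, the "similar documents" retrieval that the vector store performs is, at its core, nearest-neighbour search by cosine similarity over embedding vectors. Here is a toy, self-contained illustration of that operation (not Pinecone's actual implementation; the vectors and document ids are made up):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product divided by the product of the norms
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=2):
    # index: list of (doc_id, vector) pairs; return ids of the k most similar
    scored = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

index = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]
print(top_k([1.0, 0.05, 0.0], index, k=2))  # → ['doc-a', 'doc-b']
```

A serverless Pinecone index does the same conceptual job at scale, with an approximate-nearest-neighbour data structure instead of a linear scan.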
Feel free to implement these changes and share your experiences with the community!
System Info
Ubuntu 22.04.4 LTS
python==3.10.12