- Available Vector Databases
- Configuring Milvus with GPU Acceleration
- Configuring pgvector as the Vector Database
- Configuring Support for an External Milvus or pgvector database
- Adding a New Vector Store
By default, the Docker Compose files for the examples deploy Milvus as the vector database with CPU-only support. You must install the NVIDIA Container Toolkit to use Milvus with GPU acceleration.
The examples support the following vector databases:
- LlamaIndex: Milvus, pgvector
- LangChain: FAISS, Milvus, pgvector
The following customizations are common:
- Use Milvus with GPU acceleration.
- Use pgvector as an alternative to Milvus. pgvector uses CPU only.
- Use your own vector database and prevent deploying a vector database with each RAG example.
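All three customizations ultimately change which backend the Chain Server selects through its environment variables. As a rough, stdlib-only sketch (the `select_vector_store` helper and the assumed `milvus` default are illustrative, not part of the repository), the dispatch on `APP_VECTORSTORE_NAME` can be pictured like this:

```python
import os

# Hypothetical sketch: choose a vector store backend from APP_VECTORSTORE_NAME.
# The backend names mirror the examples above; the default of "milvus" is an
# assumption based on the default Docker Compose deployment.
SUPPORTED_STORES = {"milvus", "pgvector", "faiss"}

def select_vector_store() -> str:
    name = os.getenv("APP_VECTORSTORE_NAME", "milvus").lower()
    if name not in SUPPORTED_STORES:
        raise ValueError(f"Unsupported vector store: {name}")
    return name

os.environ["APP_VECTORSTORE_NAME"] = "pgvector"
print(select_vector_store())  # pgvector
```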
## Configuring Milvus with GPU Acceleration

1. Edit the `RAG/examples/local_deploy/docker-compose-vectordb.yaml` file and make the following changes to the Milvus service.

   1. Change the image tag to include the `-gpu` suffix:

      ```yaml
      milvus:
        container_name: milvus-standalone
        image: milvusdb/milvus:v2.4.5-gpu
        ...
      ```
   2. Add the GPU resource reservation:

      ```yaml
        ...
        depends_on:
          - "etcd"
          - "minio"
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  capabilities: ["gpu"]
                  device_ids: ['${VECTORSTORE_GPU_DEVICE_ID:-0}']
        profiles: ["nemo-retriever", "milvus", ""]
      ```
2. Stop and start the containers:

   ```shell
   docker compose down
   docker compose up -d --build
   ```

   Note: When deploying Milvus with the `local-nim` profile, you must also specify the `milvus` profile to deploy the vector store:

   ```shell
   docker compose --profile local-nim --profile milvus up -d --build
   ```
3. Optional: View the chain server logs to confirm the vector database is operational.

   1. View the logs:

      ```shell
      docker logs -f chain-server
      ```

   2. Upload a document to the knowledge base. Refer to Use Unstructured Documents as a Knowledge Base for more information.

   3. Confirm the log output includes the vector database:

      ```text
      INFO:RAG.src.chain_server.utils:Using milvus collection: nvidia_api_catalog
      INFO:RAG.src.chain_server.utils:Vector store created and saved.
      ```
## Configuring pgvector as the Vector Database

1. Export the following environment variables in your terminal:

   ```shell
   export POSTGRES_PASSWORD=password
   export POSTGRES_USER=postgres
   export POSTGRES_DB=api
   ```
2. Edit the `docker-compose.yaml` file for the RAG example and set the following environment variables for the Chain Server:

   ```yaml
   environment:
     APP_VECTORSTORE_URL: "pgvector:5432"
     APP_VECTORSTORE_NAME: "pgvector"
     POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-password}
     POSTGRES_USER: ${POSTGRES_USER:-postgres}
     POSTGRES_DB: ${POSTGRES_DB:-api}
     ...
   ```
3. Start the containers:

   ```shell
   docker compose --profile pgvector up -d --build
   ```
4. Optional: View the chain server logs to confirm the vector database is operational.

   1. View the logs:

      ```shell
      docker logs -f chain-server
      ```

   2. Upload a document to the knowledge base. Refer to Use Unstructured Documents as a Knowledge Base for more information.

   3. Confirm the log output includes the vector database:

      ```text
      INFO:RAG.src.chain_server.utils:Using PGVector collection: nvidia_api_catalog
      INFO:RAG.src.chain_server.utils:Vector store created and saved.
      ```
5. To stop pgvector and the other containers, run `docker compose --profile pgvector down`.
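The `${POSTGRES_PASSWORD:-password}` syntax in the compose file falls back to a default when the variable is unset; `os.getenv` gives the same behavior if you need to read these settings from Python. A minimal illustrative sketch (the `postgres_settings` helper is hypothetical, not part of the repository):

```python
import os

# Hypothetical helper: mirror the compose file's ${VAR:-default} fallbacks.
# The default values match the ones shown in the compose snippet above.
def postgres_settings() -> dict:
    return {
        "user": os.getenv("POSTGRES_USER", "postgres"),
        "password": os.getenv("POSTGRES_PASSWORD", "password"),
        "db": os.getenv("POSTGRES_DB", "api"),
    }

os.environ["POSTGRES_DB"] = "api"
print(postgres_settings()["db"])  # api
```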
## Configuring Support for an External Milvus or pgvector Database

1. Edit the `docker-compose.yaml` file for the RAG example and make the following edits.

   1. Remove or comment the `include` path to the `docker-compose-vectordb.yaml` file:

      ```yaml
      include:
        - path:
          # - ../../local_deploy/docker-compose-vectordb.yaml
          - ../../local_deploy/docker-compose-nim-ms.yaml
      ```
   2. To use an external Milvus server, specify the connection information:

      ```yaml
      environment:
        APP_VECTORSTORE_URL: "http://<milvus-hostname-or-ipaddress>:19530"
        APP_VECTORSTORE_NAME: "milvus"
        ...
      ```
   3. To use an external pgvector server, specify the connection information:

      ```yaml
      environment:
        APP_VECTORSTORE_URL: "<pgvector-hostname-or-ipaddress>:5432"
        APP_VECTORSTORE_NAME: "pgvector"
        ...
      ```

      Also export the `POSTGRES_PASSWORD`, `POSTGRES_USER`, and `POSTGRES_DB` environment variables in your terminal.
2. Start the containers:

   ```shell
   docker compose up -d --build
   ```
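The two `APP_VECTORSTORE_URL` forms above differ: the Milvus URL carries an `http://` scheme and port 19530, while the pgvector URL is a bare `host:port` pair. A small stdlib-only sketch (the `parse_vectorstore_url` helper is illustrative, not from the repository) shows how both forms reduce to a host and port:

```python
from urllib.parse import urlparse

# Hypothetical helper: normalize either APP_VECTORSTORE_URL form
# ("http://host:19530" for Milvus, "host:5432" for pgvector) to (host, port).
def parse_vectorstore_url(url: str) -> tuple:
    if "://" in url:
        # Milvus-style URL with a scheme: let urlparse split it.
        parsed = urlparse(url)
        return parsed.hostname, parsed.port
    # pgvector-style bare host:port pair.
    host, _, port = url.partition(":")
    return host, int(port)

print(parse_vectorstore_url("http://milvus:19530"))  # ('milvus', 19530)
print(parse_vectorstore_url("pgvector:5432"))        # ('pgvector', 5432)
```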
## Adding a New Vector Store

You can extend the code to add support for any vector store.
### Using LlamaIndex

1. Navigate to the file `RAG/src/chain_server/utils.py` in the project's root directory. This file contains the utility functions used for vector store interactions.
2. Modify the `get_vector_index` function to handle your new vector store. Implement the logic for creating your vector store object within this function.

   ```python
   def get_vector_index():
       # existing code
       elif config.vector_store.name == "chromadb":
           import chromadb
           from llama_index.vector_stores.chroma import ChromaVectorStore

           if not collection_name:
               collection_name = os.getenv('COLLECTION_NAME', "vector_db")
           logger.info(f"Using Chroma collection: {collection_name}")
           chroma_client = chromadb.EphemeralClient()
           chroma_collection = chroma_client.create_collection(collection_name)
           vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
   ```
3. Modify the `get_docs_vectorstore_llamaindex` function to retrieve the list of files stored in your new vector store.

   ```python
   def get_docs_vectorstore_llamaindex():
       # existing code
       elif settings.vector_store.name == "chromadb":
           ref_doc_info = index.ref_doc_info
           # Iterate over all the documents in the vector store and return unique filenames
           for _, ref_doc_value in ref_doc_info.items():
               metadata = ref_doc_value.metadata
               if 'filename' in metadata:
                   filename = metadata['filename']
                   decoded_filenames.append(filename)
           decoded_filenames = list(set(decoded_filenames))
   ```
4. Update the `del_docs_vectorstore_llamaindex` function to handle document deletion in your new vector store.

   ```python
   def del_docs_vectorstore_llamaindex(filenames: List[str]):
       # existing code
       elif settings.vector_store.name == "chromadb":
           ref_doc_info = index.ref_doc_info
           # For each filename, delete every document whose metadata matches it
           for filename in filenames:
               for ref_doc_id, doc_info in ref_doc_info.items():
                   if 'filename' in doc_info.metadata and doc_info.metadata['filename'] == filename:
                       index.delete_ref_doc(ref_doc_id, delete_from_docstore=True)
           logger.info(f"Deleted documents with filenames {filenames}")
   ```
5. In your custom `chains.py` implementation, import the functions from `utils.py`. The sample `chains.py` in `RAG/examples/basic_rag/llamaindex` already imports the functions.

   ```python
   from RAG.src.chain_server.utils import (
       get_vector_index,
       get_docs_vectorstore_llamaindex,
       del_docs_vectorstore_llamaindex,
   )
   ```
6. Update `RAG/src/chain_server/requirements.txt` with any additional packages required for the vector store.

   ```text
   # existing dependencies
   llama-index-vector-stores-chroma
   ```
7. Build and start the containers.

   1. Navigate to the example directory:

      ```shell
      cd RAG/examples/basic_rag/llamaindex
      ```

   2. Set the `APP_VECTORSTORE_NAME` environment variable for the `chain-server` microservice in your `docker-compose.yaml` file. Set it to the name of your newly added vector store:

      ```yaml
      APP_VECTORSTORE_NAME: "chromadb"
      ```

   3. Build and deploy the microservices:

      ```shell
      docker compose up -d --build chain-server rag-playground
      ```
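The listing and deletion logic in `get_docs_vectorstore_llamaindex` and `del_docs_vectorstore_llamaindex` both walk `index.ref_doc_info` and match on a `filename` metadata key. The bookkeeping can be sketched with plain dictionaries standing in for real index entries (a stdlib-only illustration; the helper names and sample data are hypothetical):

```python
# Plain dicts stand in for the real ref_doc_info entries, where each
# ref_doc_id maps to an object carrying a metadata dict.
ref_doc_info = {
    "doc-1": {"metadata": {"filename": "report.pdf"}},
    "doc-2": {"metadata": {"filename": "notes.txt"}},
    "doc-3": {"metadata": {"filename": "report.pdf"}},  # second chunk of report.pdf
}

def list_filenames(info: dict) -> list:
    # Mirror get_docs_vectorstore_llamaindex: collect unique filenames.
    names = [v["metadata"]["filename"] for v in info.values() if "filename" in v["metadata"]]
    return sorted(set(names))

def ids_for_filename(info: dict, filename: str) -> list:
    # Mirror del_docs_vectorstore_llamaindex: find every ref_doc_id to delete.
    return [k for k, v in info.items() if v["metadata"].get("filename") == filename]

print(list_filenames(ref_doc_info))                  # ['notes.txt', 'report.pdf']
print(ids_for_filename(ref_doc_info, "report.pdf"))  # ['doc-1', 'doc-3']
```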
### Using LangChain

1. Navigate to the file `RAG/src/chain_server/utils.py` in the project's root directory.
2. Modify the `create_vectorstore_langchain` function to handle your new vector store. Implement the logic for creating your vector store object within it.

   ```python
   def create_vectorstore_langchain(document_embedder, collection_name: str = "") -> VectorStore:
       # existing code
       elif config.vector_store.name == "chromadb":
           from langchain_chroma import Chroma
           import chromadb

           logger.info(f"Using Chroma collection: {collection_name}")
           persistent_client = chromadb.PersistentClient()
           vectorstore = Chroma(
               client=persistent_client,
               collection_name=collection_name,
               embedding_function=document_embedder,
           )
   ```
3. Update the `get_docs_vectorstore_langchain` function to retrieve a list of documents from your new vector store. Implement your retrieval logic within it.

   ```python
   def get_docs_vectorstore_langchain(vectorstore: VectorStore) -> List[str]:
       # existing code
       elif settings.vector_store.name == "chromadb":
           chroma_data = vectorstore.get()
           filenames = set([extract_filename(metadata) for metadata in chroma_data.get("metadatas", [])])
           return filenames
   ```
4. Update the `del_docs_vectorstore_langchain` function to handle document deletion in your new vector store.

   ```python
   def del_docs_vectorstore_langchain(vectorstore: VectorStore, filenames: List[str]) -> bool:
       # existing code
       elif settings.vector_store.name == "chromadb":
           chroma_data = vectorstore.get()
           for filename in filenames:
               ids_list = [
                   chroma_data.get("ids")[idx]
                   for idx, metadata in enumerate(chroma_data.get("metadatas", []))
                   if extract_filename(metadata) == filename
               ]
               vectorstore.delete(ids_list)
           return True
   ```
5. In your custom `chains.py` implementation, import the preceding functions from `utils.py`. The sample `chains.py` in `RAG/examples/basic_rag/langchain` already imports the functions.

   ```python
   from RAG.src.chain_server.utils import (
       create_vectorstore_langchain,
       get_docs_vectorstore_langchain,
       del_docs_vectorstore_langchain,
       get_vectorstore,
   )
   ```
6. Update `RAG/src/chain_server/requirements.txt` with any additional packages required for the vector store.

   ```text
   # existing dependencies
   langchain-core==0.1.40  # Update this dependency if it conflicts with an existing one
   langchain-chroma
   ```
7. Build and start the containers.

   1. Navigate to the example directory:

      ```shell
      cd RAG/examples/basic_rag/langchain
      ```

   2. Set the `APP_VECTORSTORE_NAME` environment variable for the `chain-server` microservice in your `docker-compose.yaml` file. Set it to the name of your newly added vector store:

      ```yaml
      APP_VECTORSTORE_NAME: "chromadb"
      ```

   3. Build and deploy the microservices:

      ```shell
      docker compose up -d --build chain-server rag-playground
      ```
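The ChromaDB branches above hinge on the parallel `ids` and `metadatas` lists returned by `vectorstore.get()`. A stdlib-only sketch of the id-selection step (the sample data is invented, and `extract_filename` here is a simplified stand-in for the helper in `utils.py`):

```python
# Simplified stand-in for the extract_filename helper assumed by the
# LangChain snippets above: read the filename out of a metadata dict.
def extract_filename(metadata: dict) -> str:
    return metadata.get("filename", "")

# Invented sample of what vectorstore.get() returns: parallel lists of
# record ids and their metadata dicts.
chroma_data = {
    "ids": ["id-1", "id-2", "id-3"],
    "metadatas": [
        {"filename": "report.pdf"},
        {"filename": "notes.txt"},
        {"filename": "report.pdf"},
    ],
}

def ids_to_delete(data: dict, filename: str) -> list:
    # Mirror del_docs_vectorstore_langchain: pick the ids whose metadata
    # matches the filename, ready to pass to vectorstore.delete().
    return [
        data["ids"][idx]
        for idx, metadata in enumerate(data.get("metadatas", []))
        if extract_filename(metadata) == filename
    ]

print(ids_to_delete(chroma_data, "report.pdf"))  # ['id-1', 'id-3']
```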