This project is based on the Neo4j GenAI Stack, which bundles Neo4j, LangChain, Ollama and Streamlit into a Docker Compose environment. We made a number of changes to keep up with library upgrades and to support multiple databases.
The installation differs slightly depending on the type of GPU. Technically this is handled by Docker Compose service profiles.
Install NVIDIA's container toolkit. On an Ubuntu/Debian-based OS, run the following commands:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo nvidia-ctk config --set nvidia-container-cli.no-cgroups --in-place
After a full reboot, check that NVIDIA GPUs are detected properly using `nvidia-smi`.
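To verify that the Docker runtime can also see the GPU, a quick check might look like this (the plain `ubuntu` image is just an example; the NVIDIA runtime mounts the driver and `nvidia-smi` into the container):

```bash
# Host check: driver and GPUs visible?
nvidia-smi

# Container check: should print the same GPU table from inside a container
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```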
Ensure you have set `export COMPOSE_PROFILES=linux-gpu-nvidia`, e.g. in `~/.profile`.
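A minimal way to make the profile stick across sessions, assuming your login shell reads `~/.profile`:

```bash
# persist the compose profile for future sessions ...
echo 'export COMPOSE_PROFILES=linux-gpu-nvidia' >> ~/.profile
# ... and activate it in the current shell
export COMPOSE_PROFILES=linux-gpu-nvidia
```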
For AMD GPUs, ensure you have set `export COMPOSE_PROFILES=linux-gpu-amd`, e.g. in `~/.profile`. Depending on your GPU you might need to tweak `HSA_OVERRIDE_GFX_VERSION`.
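A sketch of the AMD variant; the override value below is only an illustration (10.3.0 is commonly used for RDNA2 consumer cards) and must be adjusted to your GPU:

```bash
# select the AMD service profile ...
echo 'export COMPOSE_PROFILES=linux-gpu-amd' >> ~/.profile
# ... and, if needed, spoof a supported GFX version for ROCm
echo 'export HSA_OVERRIDE_GFX_VERSION=10.3.0' >> ~/.profile
```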
In case you're using Ollama, install it directly on the machine. Download the model with `ollama pull llama3.2` and start Ollama with `ollama serve`.
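A typical sequence on Linux might look like this (the install script URL is Ollama's official one; skip that step if Ollama is already installed, and note that the installer may already start a background service):

```bash
# install Ollama on the host (official install script)
curl -fsSL https://ollama.com/install.sh | sh

# fetch the model used by this stack
ollama pull llama3.2

# start the Ollama server (keep it running, e.g. in a separate terminal)
ollama serve
```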
Once the GPU-dependent part is done, copy `env.example` to `.env` and configure it to your needs. It's important to set the uid/gid to the user running the Docker commands. You might want to use the `id` command to figure out your uid/gid.
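For example (the exact uid/gid variable names are defined in `env.example`; the commands below only show how to obtain the values):

```bash
cp env.example .env

# find the uid/gid of the user running the Docker commands
id -u   # numeric user id
id -g   # numeric group id
```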
The first two settings `LLM` and `EMBEDDING_MODEL` should be set according to your needs. `NEO4J_PASSWORD` should be set to a reasonable password. Note that the password is only applied if the `data` folder is empty.
You cannot change the database password by simply changing it in `.env`. Instead, follow the procedure from Recover admin user and password.
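A sketch of the relevant `.env` entries; the values below are placeholders, pick ones that match your setup:

```bash
# .env (excerpt)
# any Ollama model tag, or an OpenAI model name
LLM=llama3.2
# one of: sentence_transformer, openai, aws, ollama, google-genai-embedding-001
EMBEDDING_MODEL=sentence_transformer
# only applied on first start while the data folder is still empty
NEO4J_PASSWORD=change-me
```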
For authentication of the `loader` container, copy `build-context/auth.yaml.example` to `build-context/auth.yaml` and apply your changes - especially change the password.
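For example (the structure of the file is given by the example file; only the copy step is shown here):

```bash
cp build-context/auth.yaml.example build-context/auth.yaml
# edit build-context/auth.yaml and set your own password
```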
The database dump for Regesta Imperii is not part of the GitHub repository, therefore download it:
cd backups
./download.sh
To have an initial dataset for Neo4j, you can use a dump file. Since we're using the Community Edition of Neo4j, the database name is fixed to `neo4j`.
Warning
Be aware that seeding will not happen if the respective database already exists.
Use `docker compose up -d --build` to build and run the containers. Progress can be monitored with the usual suspects `docker compose ps` and `docker compose logs`.
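A typical build-and-monitor session, as a sketch:

```bash
# build the images and start all services in the background
docker compose up -d --build

# check container status and health
docker compose ps

# follow the logs of all services (or add a service name, e.g. "loader")
docker compose logs -f
```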
The `bot` container is an interactive chat bot application using Streamlit. The `loader` container vectorizes the graph data; this is intended to be a one-shot operation. The `api` container exposes the search as a REST interface; a Swagger UI for API documentation is available as well.
regestaimperii | URL |
---|---|
bot | http://localhost:8601 |
loader | http://localhost:8602 |
api | http://localhost:8604/docs |
Note
The `loader` container is password protected; the password is located in your `build-context/auth.yaml` file.
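Assuming the `api` container keeps the query endpoints of the upstream GenAI Stack (documented in the original readme below), a request might look like this; check the Swagger UI at http://localhost:8604/docs for the actual routes:

```bash
# hypothetical query, assuming the upstream /query endpoint is kept
curl 'http://localhost:8604/query?text=hello&rag=true'
```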
Original readme below:
The GenAI Stack will get you started building your own GenAI application in no time. The demo applications can serve as inspiration or as a starting point. Learn more about the details in the technical blog post.
Create a `.env` file from the environment template file `env.example`.
Available variables:
Variable Name | Default value | Description |
---|---|---|
OLLAMA_BASE_URL | http://host.docker.internal:11434 | REQUIRED - URL to Ollama LLM API |
NEO4J_URI | neo4j://database:7687 | REQUIRED - URL to Neo4j database |
NEO4J_USERNAME | neo4j | REQUIRED - Username for Neo4j database |
NEO4J_PASSWORD | password | REQUIRED - Password for Neo4j database |
LLM | llama2 | REQUIRED - Can be any Ollama model tag, or gpt-4 or gpt-3.5 or claudev2 |
EMBEDDING_MODEL | sentence_transformer | REQUIRED - Can be sentence_transformer, openai, aws, ollama or google-genai-embedding-001 |
AWS_ACCESS_KEY_ID | | REQUIRED - Only if LLM=claudev2 or embedding_model=aws |
AWS_SECRET_ACCESS_KEY | | REQUIRED - Only if LLM=claudev2 or embedding_model=aws |
AWS_DEFAULT_REGION | | REQUIRED - Only if LLM=claudev2 or embedding_model=aws |
OPENAI_API_KEY | | REQUIRED - Only if LLM=gpt-4 or LLM=gpt-3.5 or embedding_model=openai |
GOOGLE_API_KEY | | REQUIRED - Only required when using GoogleGenai LLM or embedding model google-genai-embedding-001 |
LANGCHAIN_ENDPOINT | "https://api.smith.langchain.com" | OPTIONAL - URL to Langchain Smith API |
LANGCHAIN_TRACING_V2 | false | OPTIONAL - Enable Langchain tracing v2 |
LANGCHAIN_PROJECT | | OPTIONAL - Langchain project name |
LANGCHAIN_API_KEY | | OPTIONAL - Langchain API key |
MacOS and Linux users can use any LLM that's available via Ollama. Check the "tags" section under the model page you want to use on https://ollama.ai/library and write the tag for the value of the environment variable `LLM=` in the `.env` file.
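For example, assuming you picked the `llama2:13b` tag from the library page (any other tag works the same way), the entry in `.env` would be:

```bash
# .env - tag copied verbatim from the Ollama library page
LLM=llama2:13b
```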
All platforms can use GPT-3.5-turbo and GPT-4 (bring your own API keys for OpenAI models).
MacOS
Install Ollama on MacOS and start it before running `docker compose up`, using `ollama serve` in a separate terminal.
Linux
No need to install Ollama manually, it will run in a container as part of the stack when running with the Linux profile: run `docker compose --profile linux up`. Make sure to set `OLLAMA_BASE_URL=http://llm:11434` in the `.env` file when using the Ollama docker container.
To use the Linux-GPU profile: run `docker compose --profile linux-gpu up`. Also change `OLLAMA_BASE_URL=http://llm-gpu:11434` in the `.env` file.
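A sketch of the two profile-dependent settings (this assumes the service names `llm` and `llm-gpu` from the compose file; use only the value matching your profile):

```bash
# .env - with the plain Linux profile (docker compose --profile linux up)
OLLAMA_BASE_URL=http://llm:11434
# with the GPU profile (docker compose --profile linux-gpu up), use instead:
# OLLAMA_BASE_URL=http://llm-gpu:11434
```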
Windows
Ollama now supports Windows. Install Ollama on Windows and start it before running `docker compose up`, using `ollama serve` in a separate terminal. Alternatively, Windows users can generate an OpenAI API key and configure the stack to use `gpt-3.5` or `gpt-4` in the `.env` file.
Warning
There is a performance issue that impacts Python applications in the `4.24.x` releases of Docker Desktop. Please upgrade to the latest release before using this stack.
To start everything
docker compose up
If changes to build scripts have been made, rebuild.
docker compose up --build
To enter watch mode (auto rebuild on file changes), first start everything, then in a new terminal:
docker compose watch
Shutdown: if the health check fails or containers don't start up as expected, shut down completely to start up again.
docker compose down
Here's what's in this repo:
Name | Main files | Compose name | URLs | Description |
---|---|---|---|---|
Support Bot | bot.py | bot | http://localhost:8501 | Main usecase. Fullstack Python application. |
Stack Overflow Loader | loader.py | loader | http://localhost:8502 | Load SO data into the database (create vector embeddings etc). Fullstack Python application. |
PDF Reader | pdf_bot.py | pdf_bot | http://localhost:8503 | Read local PDF and ask it questions. Fullstack Python application. |
Standalone Bot API | api.py | api | http://localhost:8504 | Standalone HTTP API streaming (SSE) + non-streaming endpoints Python. |
Standalone Bot UI | front-end/ | front-end | http://localhost:8505 | Standalone client that uses the Standalone Bot API to interact with the model. JavaScript (Svelte) front-end. |
The database can be explored at http://localhost:7474.
UI: http://localhost:8501 DB client: http://localhost:7474
- answer support question based on recent entries
- provide summarized answers with sources
- demonstrate the difference between
- RAG Disabled (pure LLM response)
- RAG Enabled (vector + knowledge graph context)
- allow generating a high-quality support ticket for the current conversation based on the style of highly rated questions in the database.
(Screenshots: chat input + RAG mode selector; CTA to auto-generate a support ticket draft; UI of the auto-generated support ticket draft)
UI: http://localhost:8502 DB client: http://localhost:7474
- import recent Stack Overflow data for certain tags into a KG
- embed questions and answers and store them in vector index
- UI: choose tags, run import, see progress, some stats of data in the database
- Load high ranked questions (regardless of tags) to support the ticket generation feature of App 1.
UI: http://localhost:8503
DB client: http://localhost:7474
This application lets you load a local PDF into text chunks and embed it into Neo4j so you can ask questions about its contents and have the LLM answer them using vector similarity search.
Endpoints:
- http://localhost:8504/query?text=hello&rag=false (non streaming)
- http://localhost:8504/query-stream?text=hello&rag=false (SSE streaming)
Example cURL command:
curl http://localhost:8504/query-stream\?text\=minimal%20hello%20world%20in%20python\&rag\=false
Exposes the functionality to answer questions in the same way as App 1 above. Uses same code and prompts.
This application has the same features as App 1, but is built separately from the back-end code using modern best practices (Vite, Svelte, Tailwind). The auto-reload on changes is instant thanks to the Docker watch sync config.