The scope of this Hyperledger Labs project is to support users (end users, developers, etc.) in their work, sparing them from wading through oceans of documents to find the information they are looking for. We are implementing an open source conversational AI tool that answers questions related to a specific context. This prototype lets you create a chatbot that runs behind a RESTful API and requires a GPU. Here are the official Wiki pages: Hyperledger Labs aifaq and Hyperledger Labs wiki. Please also read the Antitrust Policy and the Code of Conduct. Every Monday we have a public meeting; the invitation is on the Hyperledger Labs calendar: [Hyperledger Labs] FAQ AI Lab calls.
The system is an open source Python project that implements an AI chatbot replying to HTTP requests. The idea is to provide an open source framework/template that other communities/organizations/companies can use as an example. Recent results with open LLMs make it possible to achieve good performance on common hardware resources.
Below is the application architecture:
We use RAG (Retrieval-Augmented Generation, arxiv.org) for the question-answering use case. That technique improves LLM answers by incorporating knowledge from an external database (e.g. a vector database).
The image depicts two workflows:
- The data ingestion workflow
- The chat workflow
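Below is a minimal sketch of these two workflows, using chromadb with its default embedding function for illustration (the collection name, sample documents, and prompt format are assumptions; the project's actual logic lives in ingest.py and api.py):

```python
# Minimal RAG sketch: ingestion + retrieval (illustrative only).
import chromadb

client = chromadb.PersistentClient(path="chromadb")  # vector database folder
collection = client.get_or_create_collection("aifaq_docs")  # assumed name

# Data ingestion workflow: store context documents in the vector database.
collection.add(
    ids=["doc1", "doc2"],
    documents=[
        "Hyperledger Fabric is a permissioned blockchain platform.",
        "The test network can be started with ./network.sh up.",
    ],
)

# Chat workflow: retrieve the most relevant context for a question,
# then pass it to the LLM together with the question.
question = "How do I start the Fabric test network?"
results = collection.query(query_texts=[question], n_results=1)
context = results["documents"][0][0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this prompt is what gets sent to the LLM
```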
During the ingestion phase, the system loads context documents and creates a vector database. For example, the document sources can be:
- An online software guide (readthedocs template)
- The GitHub issues and pull requests
In our case, they are the ReadTheDocs guide and a wiki page.
After the first phase, the system is ready to reply to user questions.
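For illustration, the downloaded HTML pages can be converted to plain text before indexing. This sketch uses BeautifulSoup, which is an assumption, not necessarily the loader the project uses:

```python
# Sketch: extract plain text from the downloaded ReadTheDocs HTML pages.
# BeautifulSoup is assumed here; the project may use a different loader.
from pathlib import Path
from bs4 import BeautifulSoup

texts = []
for html_file in Path("rtdocs").rglob("*.html"):
    html = html_file.read_text(encoding="utf-8", errors="ignore")
    soup = BeautifulSoup(html, "html.parser")
    texts.append(soup.get_text(separator=" ", strip=True))
print(f"Loaded {len(texts)} pages")
```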
Currently, we use the open source HuggingFace Zephyr-7b-beta model, and in the future we want to investigate other open source models.
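For reference, the model can be loaded with the HuggingFace transformers pipeline, following its model card (a sketch, not the project's exact code):

```python
# Sketch: load Zephyr-7b-beta with the transformers pipeline (needs a GPU).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-7b-beta",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [{"role": "user", "content": "What is Hyperledger Fabric?"}]
# Zephyr uses a chat template; build the prompt from the message list.
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
output = pipe(prompt, max_new_tokens=256, do_sample=False)
print(output[0]["generated_text"])
```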
The user can query the system using HTTP requests, but we also want to supply UI samples as an external module.
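For example, the query endpoint used later in this guide can also be called from Python (assuming the API is running locally on port 8080):

```python
# Sketch: query the chatbot API from Python; same request as the curl
# example later in this guide (endpoint and payload taken from there).
import requests

response = requests.post(
    "http://127.0.0.1:8080/query",
    json={"text": "How to install Hyperledger fabric?"},
)
print(response.json())
```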
The software is under the Apache 2.0 License (please check the LICENSE and NOTICE files included). We use some 3rd party libraries: here is the ASF 3rd Party License Policy and here is the information about Assembling LICENSE and NOTICE files.
This document does not contain commercial advertisement: all the tools/products/books/materials are generic and you should consider them as examples!
This software needs a GPU for execution: if you do not have a local GPU, you can use a Cloud GPU. There are several Cloud GPU solutions:
- Cloud Provider (AWS, GCP, ...)
- On-Demand GPU Cloud (vast.ai, RunPod, ...)
- Cloud GPU IDE
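Whichever option you choose, a quick way to verify that a GPU is visible is a PyTorch check:

```python
# Quick check that PyTorch can see a CUDA GPU before running the scripts.
import torch

if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))  # e.g. "NVIDIA L4"
else:
    print("No CUDA GPU visible")
```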
Currently, I use a Cloud GPU IDE (Lightning AI Studio). After signup/login, create a new Studio (project):
select the solution on the left:
click on the Start button and rename the new Studio:
then copy-paste the GitHub API repo code (see the example commands after the folder list below):
and create two folders:
- chromadb (it will contain the vector database files)
- rtdocs (it will contain the ReadTheDocs documentation)
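For example (the repository URL below is an assumption; use the URL you copied from the GitHub page):
- git clone https://github.com/hyperledger-labs/aifaq.git (assumed repository URL)
- cd aifaq
- mkdir chromadb rtdocs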
This version works with Hyperledger Fabric documents (Wiki and ReadTheDocs).
Open a new terminal:
and download the documentation by executing the command below:
wget -r -A.html -P rtdocs https://hyperledger-fabric.readthedocs.io/en/release-2.5/
after a minute or so we can interrupt the download (CTRL + C), because it starts fetching previous versions:
Now we need to move the release-2.5 content up into the rtdocs folder. First, compress the content: move into the release-2.5 folder and create an archive with tar:
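For example, assuming we are inside the release-2.5 folder (this command is an illustration of the step above):
tar -czvf readthedocs.tar.gz *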
and move the readthedocs.tar.gz to the parent directory (../):
- mv readthedocs.tar.gz ..
- cd ..
repeating the two commands until we are inside the rtdocs folder:
now remove the hyperledger… folder and its content:
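The folder to remove is the one wget created from the URL host name (the full name below is inferred from the wget command above):
rm -rf hyperledger-fabric.readthedocs.io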
uncompress the file here and remove the compressed file:
- tar -xzvf readthedocs.tar.gz
- rm readthedocs.tar.gz
Move to the parent folder and execute the command below:
pip install -r requirements.txt
After installing the requirements, we can switch to a GPU before executing the ingestion script:
then select the L4 GPU option:
and confirm (it takes a few minutes).
Run the ingest.py script:
it will populate the chromadb folder.
Now we can run the API and test it. Run the api.py script:
and test it:
curl --header "Content-Type: application/json" --request POST --data '{"text": "How to install Hyperledger fabric?"}' http://127.0.0.1:8080/query
Below is the result:
This is a proof-of-concept; a list of future improvements is below:
- This is the first version of the prototype and it will be installed on a GPU Cloud Server
- At the same time, we'd like to move to the next step: the Hyperledger Incubation Stage
- We will investigate other open source models
- Evaluation of the system using standard metrics
- We would like to improve the system; some ideas are fine-tuning, Advanced RAG, and Decomposed LoRA
- Add "guardrails" which are a specific ways of controlling the output of a LLM, such as talking avoid specific topics, responding in a particular way to specific user requests, etc.