Personal chatbot built on LLMs, currently supporting the Hugging Face API and the NVIDIA API, with a LightRAG implementation for RAG and Stable Diffusion for image generation (plus LoRA for image enhancement).
The chatbot code follows an OOP design, which makes this implementation flexible for any kind of LLM API as well as locally run models (future work for the demo).
For an implementation example, please refer to chatbot.py
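As a rough illustration of the OOP idea, a minimal sketch of how a common base class can abstract over different LLM providers is shown below. The class names (`LLMBackend`, `EchoBackend`) are hypothetical and not the actual classes in chatbot.py:

```python
from abc import ABC, abstractmethod


class LLMBackend(ABC):
    """Common interface every LLM provider implements."""

    def set_api_key(self, key: str) -> None:
        self.api_key = key

    @abstractmethod
    def chat(self, prompt: str) -> str:
        """Send a prompt and return the model's reply."""


class EchoBackend(LLMBackend):
    """Stand-in backend so the sketch runs offline; a real subclass
    would call the Hugging Face or NVIDIA API here."""

    def chat(self, prompt: str) -> str:
        return f"echo: {prompt}"


bot = EchoBackend()
bot.set_api_key("dummy-key")
print(bot.chat("hello"))  # → echo: hello
```

Swapping providers then only requires instantiating a different subclass; the rest of the chatbot code talks to the shared interface.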
LightRAG documentation: click here
Supported Nvidia Model: click here
Lora and SD checkpoint: click here
- Analyzing large files greatly increases the input token count, which can raise an error.
- A long chat history also inflates the input token count.
- Because all tasks are performed by the LLM, the quality of analysis and RAG depends heavily on the LLM you're using and on the text-embedding model used for retrieval in RAG.
- By default, this code uses Llama 70B, whose analysis is less detailed than a human's.
- The RAG method uses LightRAG.
- Install the dependencies: `pip install -r requirements.txt`
- Make sure you've cloned the latest LightRAG sub-module used in this repo, using the command `git clone https://github.com/HKUDS/LightRAG.git`
- Put the LoRA and SD checkpoints inside the corresponding folders under stable-diffusion/models. The locations are customizable to your needs, but you'll have to modify the streamlit_app.py code accordingly.
- Insert your API key with `bot.set_api_key("nvapi-xxxx")` to use NVIDIA NIM, or `bot.set_api_key("hf_xxxx")` to use a Hugging Face model.
- To run the Streamlit GUI, run this on your environment's command line or prompt: `streamlit run streamlit_app.py`
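Since NVIDIA keys start with `nvapi-` and Hugging Face keys with `hf_`, the provider can be inferred from the key itself. The helper below is an illustrative sketch, not a function from this repo:

```python
def provider_from_key(api_key: str) -> str:
    """Guess the LLM provider from the API key prefix.
    Hypothetical helper for illustration only."""
    if api_key.startswith("nvapi-"):
        return "nvidia"
    if api_key.startswith("hf_"):
        return "huggingface"
    raise ValueError("Unrecognized API key format")


print(provider_from_key("nvapi-xxxx"))  # → nvidia
print(provider_from_key("hf_xxxx"))     # → huggingface
```

A dispatch like this lets one `set_api_key` call pick the right backend without a separate provider flag.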
- Send the `/relearn` command to the chatbot to begin re-learning from every file inside the data folder.
- Use the command `/rag what-to-do` to use RAG in the chatbot.
- To perform analysis, simply upload a file to the chatbot and send `/analyze filename.extension what-to-do`, or `/analyze filename.extension` for a general information analysis.
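Chat commands like the ones above can be separated from ordinary messages with a small parser. This is a sketch under the assumption of whitespace-separated arguments; the actual parsing in this repo may differ:

```python
import shlex


def parse_command(message: str):
    """Split a chat message into (command, args).
    Plain messages (no leading '/') return (None, message).
    Illustrative only -- not the repo's actual parser."""
    if not message.startswith("/"):
        return None, message
    parts = shlex.split(message)
    return parts[0], parts[1:]


print(parse_command("/analyze report.pdf summarize the findings"))
# → ('/analyze', ['report.pdf', 'summarize', 'the', 'findings'])
print(parse_command("/relearn"))
# → ('/relearn', [])
```

The first token after `/analyze` would be treated as the filename and the rest as the instruction.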
- To save an uploaded file into the /data files for RAG, simply use the `/save` command; the chatbot will automatically re-learn without needing the `/relearn` command.
- To perform image generation, use English phrasing like "generate me an image of.." or "make me an image..". Currently, the algorithm works by detecting these patterns in the user's English prompt.
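Pattern detection of this kind is commonly done with a regular expression. The pattern below is an assumption for illustration; the repo's actual detector may recognize different phrases:

```python
import re

# Hypothetical pattern: a generation verb followed by the word "image".
IMAGE_REQUEST = re.compile(
    r"\b(generate|make|draw|create)\b.*\bimage\b", re.IGNORECASE
)


def is_image_request(prompt: str) -> bool:
    """Return True if the prompt looks like an image-generation request."""
    return bool(IMAGE_REQUEST.search(prompt))


print(is_image_request("generate me an image of a cat"))  # → True
print(is_image_request("what is RAG?"))                   # → False
```

When the check matches, the prompt would be routed to the Stable Diffusion pipeline instead of the LLM.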
Note: Performance depends on the type of LLM. The cut-off part of the chat shown in the screenshot is due to limitations of the LLM used.