How do I run the Ollama [Bug]: #2777
Comments
Hey @FaizanMunsaf! I'm here to help you with the Ollama issue you're facing. I'm a bot designed to assist with bugs, questions, and contributions. Let's work on resolving this together. To run the Ollama private model using the current repository, follow these steps:
By following these steps, you will have the Ollama model running and integrated with Quivr [1][2][3].
The real problem is getting it to work with llama3. How do we do that?
How do I run it with llama3?
If you look at the "brains" table, there is a "model" column, and they don't give you a way to change the model initially; the code will default to gpt-3.5-turbo-0125 if this value is NULL. So put "ollama/llama2" in the row for your brain under the model column. Unfortunately I get a new error, openai.NotFoundError: 404 page not found... it looks like it is still trying to use OpenAI for some reason.

Edit: the reason was that the API base URL is different for actually chatting with the bot; look at my post below for the solution. I don't know if this was a coincidence, because the Ollama API is now compatible with the OpenAI format. At first glance it looks like the code is still hardcoded to use ChatOpenAI (see backend/core/quivr_core/llm/llm_endpoint.py), but it works anyway. Hopefully the authors don't intentionally keep breaking the Ollama functionality.
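If you'd rather script that than click through the Supabase UI, here is a minimal sketch using supabase-py (the table and column names come from this thread; the URL, key, and brain ID are placeholders):

```python
from supabase import create_client

# Placeholders: point these at your local Supabase instance.
supabase = create_client("http://localhost:54321", "YOUR_SERVICE_ROLE_KEY")

# Set the model for one brain; a NULL here makes Quivr fall back to gpt-3.5-turbo-0125.
supabase.table("brains").update({"model": "ollama/llama2"}).eq(
    "brain_id", "YOUR_BRAIN_ID"
).execute()
```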
I got further along. The 404 error comes from quivr_api/modules/rag_service/rag_service.py: change line 92 and append "/v1" to ollama_url in the llm_base_url assignment.

Now I get a different error; it just says the model is not found:

openai.NotFoundError: Error code: 404 - {'error': {'message': 'model "ollama/llama2" not found, try pulling it first', 'type': 'api_error', 'param': None, 'code': None}}

So the API is working; we just sent it a bad model name. It probably expects "llama2", so change line 91 and hardcode it; who cares for now. With that, it is now working; it is complete.
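For reference, a minimal sketch of what those two edits might look like (ollama_url is hardcoded here for illustration; the real value comes from settings, and exact line numbers vary between Quivr versions):

```python
# quivr_api/modules/rag_service/rag_service.py (sketch, not the exact file)
ollama_url = "http://host.docker.internal:11434"

model_to_use = "llama2"            # line 91: the name Ollama actually serves
llm_base_url = ollama_url + "/v1"  # line 92: Ollama's OpenAI-compatible API lives under /v1
```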
As long as the API works the same, you literally just type in llama3 wherever you put llama2 before.
@c4801725870 I have tried and still cannot seem to get it to work. Here is the complete process I just followed:

It still seems to want to use gpt-3.5. Any help is appreciated.
My user_settings only includes ollama. Also, in quivr_api/modules/chat/controller/chat/utils.py, change line 45: as you can see, they hardcoded it to gpt-3.5 and there is no consideration for other models. A sketch of the change follows below.
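The variable name and fallback value here are assumptions based on this thread, not the actual source:

```python
# quivr_api/modules/chat/controller/chat/utils.py, around line 45 (sketch).
# The stock code falls back to a hardcoded OpenAI model; pointing the
# fallback at Ollama keeps brain creation local.
default_model = "ollama/llama2"  # was: "gpt-3.5-turbo-0125"
```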
@PFLigthart I took another look to see if I can make it possible to switch models. Unfortunately we run into the issue that the "default model" referenced in quivr_api/modules/chat/controller/chat/utils.py is called upon brain creation, because the row in the brains table defaults "model" to NULL. Also, worth mentioning: you need to add the model to the models table, just adding a row for the new model. I proved this by changing the code to use "brain_model" rather than "default_model", around the block that begins:

```python
# If brain.model is None, set it to the default_model
```
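A hedged guess at the logic after that change, written as a hypothetical helper (Quivr's real code is inline, not a function):

```python
def pick_model(brain, default_model: str) -> str:
    # If brain.model is None, fall back to default_model;
    # otherwise prefer the model stored on the brain row itself.
    return brain.model if brain.model is not None else default_model
```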
For quivr_api/modules/rag_service/rag_service.py, change line 91 to:

The bad news I have so far is that the new code seems really bad at tokenizing the data input when the brain is created. This project needs a lot of work.
@c4801725870 this worked. Thank you so much. I was also able to get it to work with llama3 and llama3.1 by hardcoding the relevant entries. Appreciate it.
The latest version has no models column in the user_settings table, but it has a new table named models. So I added the ollama mistral model to this table, but when I ask a question in Quivr, the error is "TypeError: network error", and the Ollama log shows it received a POST request with the URI "/chat/completions"; the POST data is OK, with model name "mistral". But as [ollama chat-with-a-model](https://github.com/ollama/ollama?tab=readme-ov-file#chat-with-a-model) says, Quivr should POST to the URI "/api/chat". So is the URI "/chat/completions" correct?
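For what it's worth, both URIs exist: /api/chat is Ollama's native chat API, while /v1/chat/completions is Ollama's OpenAI-compatible layer, which is what Quivr targets once the base URL ends in /v1. A quick way to compare the two, assuming Ollama is serving mistral on localhost:11434:

```python
import requests

payload = {
    "model": "mistral",
    "messages": [{"role": "user", "content": "hello"}],
    "stream": False,
}

# Ollama's native chat API:
r1 = requests.post("http://localhost:11434/api/chat", json=payload)

# Ollama's OpenAI-compatible API (the /chat/completions you saw in the log):
r2 = requests.post("http://localhost:11434/v1/chat/completions", json=payload)

print(r1.status_code, r2.status_code)  # both should be 200 once the model is pulled
```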
@caishanli you need to make sure the model in the models table has "ollama/" as part of the string: literally "ollama/mistral", for example. While reading the code I noticed it only detects Ollama if it actually finds "ollama" as part of the model string. Simply putting "mistral" as the model will be detected as GPT and probably use the wrong API, which is what you experienced. This is why line 91 needs to use the string split method to extract the model type; if I just changed it to "llama3" instead of "ollama/llama3", it would not hit that branch of the code. Again, this project needs a lot of work and you need to carefully examine the code to get it functioning. It is not in the authors' interest to produce a repo that allows full offline functionality out of the box for free.
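A small sketch of the detect-and-split idea described above (the function and names are illustrative, not Quivr's actual variables):

```python
def resolve_model(model_name: str) -> tuple[str, str]:
    """Map a configured model string to (provider, api_model)."""
    if "ollama" in model_name:
        # "ollama/mistral" -> send just "mistral" to the Ollama API
        return "ollama", model_name.split("/", 1)[-1]
    # anything else falls through to the OpenAI code path
    return "openai", model_name

print(resolve_model("ollama/mistral"))      # ('ollama', 'mistral')
print(resolve_model("gpt-3.5-turbo-0125"))  # ('openai', 'gpt-3.5-turbo-0125')
```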
Thank you for your reply! I tested ollama/mistral; the result is the same. The main problem is that Quivr sends the wrong POST data to Ollama; in short, Quivr uses the wrong Ollama REST API, so Ollama can't reply correctly.
Mine is not working, so I have to import
This change breaks the brain. ollama: 0.3.9
Supabase has a completely new interface now, and I failed to configure llama3.1 using the new Supabase. The "models" column has been removed from user settings, and I have to add Ollama to the "models" table instead, but I can't manage to get it to work. Has anyone succeeded in doing that with the new Supabase?
@aishock I believe I'm using the same version as you. I also updated the model in the "models" table instead, but some of the code needs to be changed because it is hardcoded to OpenAI. Changes:

.env.example

to:

backend/api/quivr_api/modules/rag_service/rag_service.py

```python
llm_config=LLMEndpointConfig(
    model=self.model_to_use,  # type: ignore
    llm_base_url=model.endpoint_url,
    llm_api_key=api_key,
```

to:

```python
llm_config=LLMEndpointConfig(
    model="llama2",  # self.model_to_use,  # type: ignore
    llm_base_url="http://host.docker.internal:11434",  # model.endpoint_url,
    llm_api_key=api_key,
```

backend/core/quivr_core/llm/llm_endpoint.py

```python
from langchain_openai import AzureChatOpenAI, ChatOpenAI
```

to:

```python
from langchain_openai import AzureChatOpenAI, ChatOpenAI
from langchain_community.chat_models import ChatOllama
```

Update:

```python
_llm = ChatOpenAI(
    model=config.model,
    api_key=SecretStr(config.llm_api_key)
```

to:

```python
_llm = ChatOllama(
    model="llama2",  # config.model,
    api_key=SecretStr(config.llm_api_key)
```

backend/supabase/config.toml

to:

backend/supabase/seed.sql

```sql
INSERT INTO "public"."brains"...
INSERT INTO "public"."brains_users"...
```

Update:

```sql
INSERT INTO "public"."models" ("name", "price", "max_input", "max_output", "description", "display_name", "image_url", "default", "endpoint_url", "env_variable_name") VALUES
('gpt-4-0125-preview', 1, 4000, 4000, 'Default Description', 'GPT4', 'https://quivr-cms.s3.eu-west-3.amazonaws.com/logo_quivr_white_7e3c72620f.png', false, 'https://api.openai.com/v1', 'OPENAI_API_KEY'),
('gpt-3.5-turbo-0125', 1, 10000, 1000, 'Default Description', 'GPT-3.5', 'https://quivr-cms.s3.eu-west-3.amazonaws.com/logo_quivr_white_7e3c72620f.png', true, 'https://api.openai.com/v1', 'OPENAI_API_KEY');
```

to:

```sql
INSERT INTO "public"."models" ("name", "price", "max_input", "max_output", "description", "display_name", "image_url", "default", "endpoint_url", "env_variable_name") VALUES
('ollama/llama2', 1, 10000, 1000, 'Default Description', 'ollama/llama2', 'https://quivr-cms.s3.eu-west-3.amazonaws.com/logo_quivr_white_7e3c72620f.png', true, 'http://host.docker.internal:11434', 'OLLAMA_API_BASE_URL');
```

The same applies to backend/supabase/migrations/20240103173626_init.sql and backend/supabase/migrations/20240103175048_prod.sql. After changing the code, you need to rebuild the image and Supabase.
What happened?
I am trying to run Ollama, but there is no specific command I have found that runs an Ollama private model.
Is there any further guidance available with the current repo?
Relevant log output
Just looking forward to getting Ollama into my bot; I will generate my answers using the Ollama llama3-8b model!
Twitter / LinkedIn details
No response