Use container right now #15

Merged: 1 commit merged into main from ramalama-run-serve on Jul 31, 2024
Conversation

ericcurtin (Collaborator)

ramalama run/serve right now require the container; it has the version of llama.cpp that works.

Long-term we may be able to remove this.

ericcurtin self-assigned this on Jul 31, 2024
ericcurtin requested a review from rhatdan on Jul 31, 2024 at 13:28
ericcurtin force-pushed the ramalama-run-serve branch 2 times, most recently from fced804 to d40b962, on Jul 31, 2024 at 13:39
ramalama (outdated)
@@ -342,6 +342,11 @@ def main(args):
conman = select_container_manager()
ramalama_store = get_ramalama_store()

if conman:
rhatdan (Member)

Ok so we default to running a container for use with the server.

ericcurtin (Collaborator, Author)

Sounds good to me. The other problem is that we need to add:

pip install "huggingface_hub[cli]==0.24.2"

to the install script but that's no biggie.

ericcurtin (Collaborator, Author)


Re-pushed; it now does the re-exec-in-a-container thing for just run/serve and installs the huggingface dependency in the install script.
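
For context, here is a minimal sketch of what re-execing run/serve inside a container could look like. The conman and ramalama_store names come from the diff above; the image name, mount path, and helper bodies are assumptions for illustration, not the PR's actual code.

```python
import os
import shutil
import sys


def select_container_manager():
    # Prefer podman, fall back to docker; return None if neither is on PATH.
    for manager in ("podman", "docker"):
        if shutil.which(manager):
            return manager
    return None


def reexec_in_container(conman, ramalama_store, args):
    # Hypothetical image name and mount point, purely for illustration.
    cmd = [
        conman, "run", "--rm", "-it",
        "-v", f"{ramalama_store}:/var/lib/ramalama",
        "quay.io/ramalama/ramalama:latest",
        "ramalama",
    ] + args
    # Replace the current process with the containerized invocation.
    os.execvp(cmd[0], cmd)


if __name__ == "__main__":
    conman = select_container_manager()
    store = os.path.expanduser("~/.local/share/ramalama")
    # Only run/serve need the container; other subcommands stay on the host.
    if conman and len(sys.argv) > 1 and sys.argv[1] in ("run", "serve"):
        reexec_in_container(conman, store, sys.argv[1:])
```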

rhatdan (Member) commented on Jul 31, 2024

LGTM, merge when tests pass.

ericcurtin force-pushed the ramalama-run-serve branch 2 times, most recently from 180a93c to 47c3d24, on Jul 31, 2024 at 13:59
@@ -215,6 +215,7 @@ def list_cli(ramalama_store, args):


funcDict["list"] = list_cli
funcDict["ls"] = list_cli
ericcurtin (Collaborator, Author) commented on Jul 31, 2024

Ollama has ls as an alias for list, so I'm trying to keep with "norms" from other tools, at least when it's easy.
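
As a rough illustration of the alias pattern: the dispatch code below is an assumption about how funcDict gets consumed, not taken from the PR.

```python
import sys


def list_cli(ramalama_store, args):
    # Stand-in for the real implementation: list models in the local store.
    print(f"listing models in {ramalama_store}")


funcDict = {}
funcDict["list"] = list_cli
funcDict["ls"] = list_cli  # alias, matching Ollama's "ls"


def dispatch(command, ramalama_store, args):
    try:
        funcDict[command](ramalama_store, args)
    except KeyError:
        sys.exit(f"ramalama: unknown command '{command}'")
```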

ericcurtin force-pushed the ramalama-run-serve branch 2 times, most recently from 34635dd to 2f3cc71, on Jul 31, 2024 at 14:24
ericcurtin (Collaborator, Author)

I think it should work this time; I just wanted to get a test in to prevent run/serve breakage again.

If we can get the llama-cpp-python library working well, it might help reduce dependencies on containers.

But these LLM environments can get complex in general.
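
For reference, a container-free path with llama-cpp-python might look roughly like the sketch below; the model path is a placeholder and this only shows the library's basic completion API, not anything from this PR.

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Placeholder path: any local GGUF model file works here.
llm = Llama(model_path="./models/model.Q4_K_M.gguf", n_ctx=2048)

out = llm("Explain container images in one sentence.", max_tokens=64)
print(out["choices"][0]["text"].strip())
```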

ramalama run/serve right now require the container, it has the version of llama.cpp that works.

Long-term we may be able to remove this.

Signed-off-by: Eric Curtin <ecurtin@redhat.com>
ericcurtin merged commit 9172fbc into main on Jul 31, 2024
3 checks passed
ericcurtin deleted the ramalama-run-serve branch on Jul 31, 2024 at 14:45
ericcurtin (Collaborator, Author)

Merging; people might try this out in the next few minutes, and I'd hope they don't install the broken version.

ericcurtin (Collaborator, Author)

We should create a stable branch soon.
