Need assistance - installed via bash...chatbox empty response #3595
-
I have been lurking here for a few days and finally wanted to give this a try; I have high expectations for it. On my Windows machine (32 GB RAM, Ryzen 5800H) I opened bash and ran: "curl https://localai.io/install.sh | sh". I already had Docker Desktop installed, so the installer detected Docker and pulled the images, but it failed with an error on port 8080. I set the port parameter to 9000 and ran it again; this time it completed and I opened 127.0.0.1:9000. The web app loads but shows no models, so I installed 9 models that support CPU inference (my AMD mini PC has no GPU). See attachment. Now, no matter which model I choose, the chatbox comes back with an empty response. See attached. Next I tried text to speech and a Stable Diffusion model and got a model load error. See attached. Has anyone run into this issue? My system has enough resources to load these models, so I want to understand what is going wrong and how to fix it.
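For reference, here is roughly what I ran, as a sketch (the PORT environment variable is how I passed the port override, on the assumption that the install script reads it; the Docker alternative below is only illustrative, and the image tag should be checked against the LocalAI docs):

```bash
# Re-run the installer on a non-default port, since 8080 was already taken.
# Assumption: the install script honors a PORT environment variable.
curl https://localai.io/install.sh | PORT=9000 sh

# Alternative sketch: start the CPU image directly with Docker Desktop,
# mapping host port 9000 to the API's default port 8080 inside the container.
# (Image tag is an assumption -- verify against the LocalAI documentation.)
docker run -d --name local-ai -p 9000:8080 localai/localai:latest-aio-cpu
```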
-
Working, albeit slowly, it seems.
-
True, I felt the same way. It looks like the owner and/or no one else has tested this under Windows/WSL. I don't have any terminal/cmd/PowerShell windows open, yet somehow my local instance of LocalAI is still running; I need to check whether the install script set it up to run permanently. I genuinely expected someone here to help me rather than call me a troll.
Nevertheless, you are right, I may well end up going the same route as you. But I'm curious what you meant by higher tokens per second: if both are running local LLMs, that shouldn't make much difference, AFAIK, no?
Thanks for responding!
On Sat, Sep 21, 2024 at 5:16 PM, helf-charles wrote:
I can't find a way to message you directly, and I'm loath to post my reply to you here, as I have a sneaking suspicion that every post I make from now on will be labeled as "trolling". But I genuinely feel bad that the owner of the repo seemed to believe that you and I are the same person, based on the now-deleted thread.
LocalAI is a rather bulky application because it includes a fairly involved front end. If you don't have a compatible GPU, you might be better off running Llama.cpp directly instead. It runs in a terminal rather than a GUI, but in my experience it delivers significantly higher tokens per second than going through LocalAI.
So, with LLMs, the models are effectively just a probabilistic data bank; they don't do anything in and of themselves. They require an inference engine to load portions of the model into memory and then use those portions to transform user input into a response. The inference engine is a set of low-level processes that execute these transformations, and efficiency is extremely important because language models require a truly massive number of calculations.
Thus, the faster and more lightweight the inference engine's architecture, the more rapidly it will be able to use the loaded portions of a language model to generate responses. Llama.cpp is basically the gol…
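As a concrete illustration, here is a minimal sketch of trying Llama.cpp directly on CPU (assuming a recent checkout where the CLI binary is named llama-cli; ./models/model.gguf is a placeholder for whatever GGUF file you download, and -t sets the CPU thread count):

```bash
# Build llama.cpp from source (CPU-only is the default build).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build && cmake --build build --config Release

# Run a prompt against a local GGUF model.
./build/bin/llama-cli -m ./models/model.gguf -t 8 -p "Hello, how are you?"
```

The timing summary it prints at the end reports tokens per second, which makes the comparison with LocalAI concrete.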