Searxng and Perplexica are not working properly #60

Open
nullnuller opened this issue Oct 14, 2024 · 1 comment
Labels: good first issue, question

nullnuller commented Oct 14, 2024

All configs are default; nothing was changed after the git clone and Harbor installation steps.
I am using an Ollama instance running outside of Docker (not the Harbor Ollama), which can be accessed by the WebUI when used for chat.

(base) harbor$ harbor doctor
19:48:52 [INFO] Running Harbor Doctor...
19:48:52 [INFO] ✔ Docker is installed and running
19:48:52 [INFO] ✔ Docker Compose (v2) is installed
19:48:52 [INFO] ✔ .env file exists and is readable
19:48:52 [INFO] ✔ default profile exists and is readable
19:48:52 [INFO] ✔ Harbor workspace directory exists
19:48:52 [INFO] ✔ CLI is linked
19:48:52 [WARN] ✘ NVIDIA Container Toolkit is not installed. NVIDIA GPU support may not work.
19:48:52 [INFO] Harbor Doctor checks completed successfully.

harbor info

Harbor CLI version: 0.2.11
==========================
19:48:47 [INFO] Harbor active services:
boost
ollama
perplexica
perplexica-be
searxng
stt
tts
webui
==========================
Client: Docker Engine - Community
 Version:    27.3.1
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.17.1
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.7
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 13
  Running: 8
  Paused: 0
  Stopped: 5
 Images: 56
 Server Version: 27.3.1
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-41-generic
 Operating System: Ubuntu 24.04.1 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 56
 Total Memory: 251.8GiB
 Name: 
 ID:
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

SearXNG seems to return some results, but they don't get passed to any LLM I try.
[screenshot]

searxng log

(base) nulled@mail:~/Downloads/LLM_Applications/harbor$ harbor logs searxng
harbor.searxng  |   File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1617, in send
harbor.searxng  |     response = await self._send_handling_auth(
harbor.searxng  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
harbor.searxng  |   File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1645, in _send_handling_auth
harbor.searxng  |     response = await self._send_handling_redirects(
harbor.searxng  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
harbor.searxng  |   File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1682, in _send_handling_redirects
harbor.searxng  |     response = await self._send_single_request(request)
harbor.searxng  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
harbor.searxng  |   File "/usr/lib/python3.12/site-packages/httpx/_client.py", line 1719, in _send_single_request
harbor.searxng  |     response = await transport.handle_async_request(request)
harbor.searxng  |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
harbor.searxng  |   File "/usr/lib/python3.12/site-packages/httpx/_transports/default.py", line 352, in handle_async_request
harbor.searxng  |     with map_httpcore_exceptions():
harbor.searxng  |          ^^^^^^^^^^^^^^^^^^^^^^^^^
harbor.searxng  |   File "/usr/lib/python3.12/contextlib.py", line 158, in __exit__
harbor.searxng  |     self.gen.throw(value)
harbor.searxng  |   File "/usr/lib/python3.12/site-packages/httpx/_transports/default.py", line 77, in map_httpcore_exceptions
harbor.searxng  |     raise mapped_exc(message) from exc
harbor.searxng  | httpx.ConnectTimeout

Perplexica keeps looking for an answer and never ends.

[screenshot]

harbor.perplexica  | yarn run v1.22.22
harbor.perplexica  | $ next start
harbor.perplexica  |    ▲ Next.js 14.1.4
harbor.perplexica  |    - Local:        http://localhost:3000
harbor.perplexica  | 
harbor.perplexica  |  ✓ Ready in 1223ms
av (Owner) commented Oct 14, 2024

Hi, thanks for the detailed report!

There are a few items; I'll try to cover them one by one:

> I am using an Ollama instance running outside of Docker

That's perfectly fine, but I'd recommend removing ollama from the default services in Harbor in that case:

harbor defaults # see current
harbor defaults rm ollama # remove built-in ollama

Otherwise, all Harbor services will try to use the internal Ollama instance. It shares the model cache with the host instance, which is why you might be seeing the same models available.

Note that removing it from defaults means that the services talking to the built-in Ollama won't be configured to do so anymore (and you'll have to configure them manually). An alternative that avoids manual reconfiguration is to replace Harbor's internal Ollama URL with yours, as described in this issue:

# 172.17.0.1 is the IP of your host within the container
# 11434 is the port for Ollama on the host
harbor config set ollama.internal_url http://172.17.0.1:11434
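
To double-check that containers can actually reach the host Ollama, a quick sanity check from a throwaway container (this assumes Docker's default bridge network and that Ollama on the host listens on 0.0.0.0:11434 rather than only on 127.0.0.1):

# Assumes the default bridge network and that the host Ollama binds to 0.0.0.0:11434.
# /api/version is a standard Ollama endpoint and should return a small JSON payload.
docker run --rm curlimages/curl -s http://172.17.0.1:11434/api/version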

> SearXNG seems to return some results, but they don't get passed to any LLM I try.

This looks like one of two possible things:

Web search was enabled after the initial generation of the message

To solve this, make sure you turn on "web search" before the first generation in the conversation; otherwise the RAG template might not be applied to the message on the first generation (and the model won't see anything). Certain models are also very overfit to reply that they don't have access to "current" information on such requests, even though they do via RAG (Qwen 2.5 shouldn't be one of them, though I've only tested up to 14B).

Embedding model missing

By default, Open WebUI is configured with mxbai-embed-large:latest as the embedder. It needs to be pulled via Ollama, otherwise search results will fail to be embedded:

harbor ollama pull mxbai-embed-large:latest
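
Assuming harbor ollama proxies the regular Ollama CLI (the pull command above suggests it does), you can confirm the model is present with:

# Should list mxbai-embed-large among the locally available models.
harbor ollama list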

If these don't help, I'd take a look at the verbose logs of Open WebUI itself to understand the specific prompts and content sent to the Ollama API.
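
Following the same pattern as the SearXNG logs above, the Open WebUI logs can be tailed with:

harbor logs webui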

> Perplexica keeps looking for an answer and never ends.

Unfortunately, some of the settings (the embeddings config) can't be pre-configured, so they have to be set via the Perplexica UI. This is mentioned in the Service Wiki; I should make it more prominent there.

When working with the Perplexica + Ollama combo, there are a few things to keep in mind:

  • Check that the embedding and LLM models are the ones you want to use; they'll have a significant impact on the overall quality of the output
  • Perplexica wants to make A LOT of embeddings
  • Ollama runs without concurrency by default: only one model (including embedding models) can be loaded at any given moment, and the same applies to requests. So the "normal" pipeline for Perplexica (query SearXNG, load the embedding model in Ollama, run 15-20 embedding requests, load the LLM in Ollama, run the generate requests) takes some time to process; see the sketch below for optionally relaxing this on the host
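
If you want to relax that constraint on your host-side Ollama, here's a minimal sketch, assuming a recent Ollama version that honours the OLLAMA_MAX_LOADED_MODELS and OLLAMA_NUM_PARALLEL environment variables (defaults and exact behaviour vary by version):

# Assumes a recent Ollama build that supports these variables;
# check the Ollama docs for your version before relying on them.
export OLLAMA_MAX_LOADED_MODELS=2   # keep the embedding model and the LLM loaded together
export OLLAMA_NUM_PARALLEL=4        # allow a few requests per model in parallel
ollama serve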

In your setup specifically, it'll be very slow due to a combination of factors:

  • Harbor's ollama runs by default, so Perplexica will connect to it
  • The NVIDIA Container Toolkit is not installed, so the internal Ollama runs on the CPU, and Perplexica's embeddings and LLM generation will be very slow (an install sketch follows below)
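
For reference, a minimal install sketch for the NVIDIA Container Toolkit on Ubuntu, assuming NVIDIA's apt repository has already been added (see NVIDIA's installation guide for the repository setup step):

# Assumes NVIDIA's apt repository is already configured on the host.
sudo apt-get install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker, then restart the daemon.
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker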

av added the good first issue and question labels on Oct 14, 2024