Embedding unreachable, but llama is running #4112

Closed
LucaFulchir opened this issue Apr 3, 2025 · 5 comments

Comments


LucaFulchir commented Apr 3, 2025

Hello, I'm trying to run/test Tabby, but I have problems with the embedding instance. I'm using version 0.27 on a NixOS unstable server.

AI completion and AI chat seem to work, but I cannot add a git context provider for a public repo: it seems to clone successfully, but it can't parse a single file.

config.toml:

[model.completion.local]
model_id = "Qwen2.5-Coder-3B"

[model.chat.local]
model_id = "Qwen2.5-Coder-1.5B-Instruct"

[model.embedding.local]
model_id = "Nomic-Embed-Text"

running with:

tabby serve --model Qwen2.5-Coder-3B --host 192.168.1.10 --port 11029 --device rocm

testing on AMD Ryzen 7 8845HS w/ Radeon 780M Graphics

On the Tabby web interface, on the system page, I see "Unreachable" only under "Embedding", with the error "error decoding response body".

The llama.cpp instance seems to be up, and by dumping the local traffic I see the following requests/responses:

GET /health HTTP/1.1
accept: */*
host: 127.0.0.1:30888

HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Length: 15
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp
------
POST /tokenize HTTP/1.1
content-type: application/json
accept: */*
host: 127.0.0.1:30888
content-length: 25

{"content":"hello Tabby"}

HTTP/1.1 200 OK
Access-Control-Allow-Origin: 
Content-Length: 28
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp

{"tokens":[7592,21628,3762]}
-----------
POST /embeddings HTTP/1.1
content-type: application/json
accept: */*
host: 127.0.0.1:30888
content-length: 27

{"content":"hello Tabby\n"}

HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Length: 16226
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp

{"embedding":[0.0018252730369567871, **a lot more floats**,-0.024591289460659027],"index":0}

Additional Tabby log entries, even when running with RUST_LOG=debug, all look like:

WARN tabby_index::indexer: crates/tabby-index/src/indexer.rs:90: Failed to build chunk for document 'git:R1AWw5:::{"path":"/var/lib/tabby/repositories/[redacted]/src/connection/handshake/dirsync/req.rs","language":"rust","git_hash":"906b1491a1a0ecb98781568b24d8ba781d6765e2"}': Failed to embed chunk text: error decoding response body

What can I try / what am I doing wrong?

@LucaFulchir
Author

Updated to 0.27.1 and tried different models thanks to more RAM; the local embedding is still marked as 'Unreachable', with the same errors.

@LucaFulchir
Author

Workaround: use an HTTP embedding configuration instead of the local one.

I literally copied the llama-server command line and ran llama.cpp manually. Connecting this way works:

[model.embedding.http]
kind = "llama.cpp/embedding"
model_name = "Nomic-Embed-Text"
api_endpoint = "http://127.0.0.1:30887"

At this point I think the kind used for [model.embedding.local] is wrong.
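
For anyone reproducing the workaround, the manual launch was along these lines (a sketch only: the model path and port are placeholders, and the real flags were copied verbatim from the command line that Tabby itself spawns):

llama-server -m /path/to/nomic-embed-text.gguf --embedding --host 127.0.0.1 --port 30887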

Member

zwpaper commented Apr 18, 2025

Hi @LucaFulchir, did you use the llama.cpp that came with Tabby, or was it installed manually as a separate component?

@LucaFulchir
Author

Tabby is configured to use the NixOS llama.cpp, built with Vulkan support. It currently seems to be release b4154.

Now I notice that when I run llama.cpp manually, it instead uses release b5141, which is much newer.
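
For reference, the build number can be checked directly on each binary (assuming the flag is available in both builds; run it against whichever llama-server Tabby is configured to launch, and against the one started manually):

llama-server --version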

Member

zwpaper commented May 3, 2025

The llama.cpp included in the Tabby release should be functional.

The llama.cpp embedding API was updated after build b4356. Please verify your version and configure it accordingly.

For more detail, you can check: https://tabby.tabbyml.com/docs/references/models-http-api/llama.cpp/

zwpaper closed this as completed May 3, 2025