How do I configure llama.cpp to use my iGPU instead of the GPU? #12443
I have an iGPU and a GPU. I would like to run one instance of llama.cpp using my GPU, an AMD Radeon™ RX 7600 XT (RADV NAVI33) (this currently works fine), and another instance on a different port using the iGPU, AMD Radeon Graphics (RADV GFX1103_R1). I've tried multiple settings (for `MESA_VK_DEVICE_SELECT` and for `LLAMA_ARG_MAIN_GPU`), but llama.cpp always picks the GPU instead of the iGPU. Any help would be really appreciated!
You can set the Vulkan device(s) to be used with `GGML_VK_VISIBLE_DEVICES`, in a similar way to how it works with CUDA. In your case you would use `GGML_VK_VISIBLE_DEVICES=0` for your iGPU, `GGML_VK_VISIBLE_DEVICES=1` for your dGPU (which is also what it defaults to), or `GGML_VK_VISIBLE_DEVICES=0,1` (or even `1,0`) for both.
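
As a concrete illustration, here is a minimal sketch of the two-instance setup from the question, assuming the standard `llama-server` binary; the model paths and port numbers are placeholders, and the device indices (0 = iGPU, 1 = dGPU) follow the mapping described above:

```sh
# Instance 1: dGPU (Vulkan device 1) serving on port 8080.
# Model path is a placeholder; substitute your own GGUF file.
GGML_VK_VISIBLE_DEVICES=1 ./llama-server -m ./models/model.gguf --port 8080 &

# Instance 2: iGPU (Vulkan device 0) serving on a different port.
GGML_VK_VISIBLE_DEVICES=0 ./llama-server -m ./models/model.gguf --port 8081 &
```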