Offloading into shared part of VRAM #1413

Open
2Fort2 opened this issue Mar 9, 2025 · 10 comments

Comments

2Fort2 commented Mar 9, 2025

Koboldcpp is offloading into the shared part of GPU memory instead of the dedicated part.

[screenshot]

Other things I've noticed: it says "unable to detect VRAM" on launch, and "device vulkan0 does not support async, host buffers or events" while offloading.

[screenshots]

OS: Windows 10 IoT Enterprise LTSC
CPU: Ryzen 5 1600
GPU: Radeon RX 9070
Kobold version: koboldcpp_nocuda 1.85.1

@LostRuins (Owner)

Can you try selecting a different GPU from the dropdown? You might have a few GPUs, and it could have picked the wrong one (the iGPU instead of the dedicated card).

[screenshot]

2Fort2 commented Mar 9, 2025

Hey, thanks for the fast reply.

GPU 1 is identified as the 9070. The others have no identifier, and when I try them the program just crashes.
I also tried the normal koboldcpp build instead of the nocuda version; same result, it just loads into shared memory, which is just normal RAM.

[screenshot]

@LostRuins (Owner)

Alright, so only ID 1 works. If you try manually setting the layers, does it work? Let's try setting 50 layers.
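If you're launching from the command line rather than the GUI, it would be something like this (flags from memory, so double-check against `koboldcpp.exe --help`; the model filename is a placeholder, and the device ID should be whichever one worked for you):

```
koboldcpp.exe --model yourmodel.gguf --usevulkan 1 --gpulayers 50
```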

2Fort2 commented Mar 9, 2025

Setting it manually yields the same result. Here are the logs.

[screenshots]

@LostRuins (Owner)

Looks like you loaded it correctly. I can see the AMD Radeon RX 9070 being selected in the log, and the Vulkan0 model buffer size is 6GB, which looks correct. So the offload should be working. If you do a partial offload, e.g. 20 layers, you'll see this value change (and it should also be reflected in Task Manager).

2Fort2 commented Mar 9, 2025

I've tried a partial offload; it does change the displayed value, but it doesn't change the final result.

[screenshot]

Strangely, even with the partial offload it still seems to occupy the shared part of GPU memory, rather than leaving that part emptier, since technically those layers have been loaded into normal RAM.

I've also tried loading models with LM Studio, and it shows exactly the same issue. That would make sense, though, given that both koboldcpp and LM Studio use the Vulkan backend of llama.cpp. Could it just be that Vulkan llama.cpp is currently bugged with the 9070 and has trouble identifying the VRAM, so it opts not to load into it?
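In case it helps narrow this down, here's a minimal sketch against the standard Vulkan C API (nothing koboldcpp-specific; the compile line is just illustrative) that dumps each device's memory heaps. If the driver doesn't advertise a large DEVICE_LOCAL heap for the 9070, that would point at the driver rather than at llama.cpp:

```c
// Minimal Vulkan memory-heap dump (standard Vulkan C API).
// Compile (illustrative): gcc vkheaps.c -o vkheaps -lvulkan
// (link vulkan-1.lib on Windows instead)
#include <stdio.h>
#include <vulkan/vulkan.h>

int main(void) {
    VkApplicationInfo app = { .sType = VK_STRUCTURE_TYPE_APPLICATION_INFO,
                              .apiVersion = VK_API_VERSION_1_0 };
    VkInstanceCreateInfo ici = { .sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO,
                                 .pApplicationInfo = &app };
    VkInstance inst;
    if (vkCreateInstance(&ici, NULL, &inst) != VK_SUCCESS) {
        fprintf(stderr, "vkCreateInstance failed\n");
        return 1;
    }

    // Enumerate all Vulkan-visible GPUs (iGPU, dGPU, software rasterizers).
    uint32_t count = 0;
    vkEnumeratePhysicalDevices(inst, &count, NULL);
    VkPhysicalDevice devs[16];
    if (count > 16) count = 16;
    vkEnumeratePhysicalDevices(inst, &count, devs);

    for (uint32_t i = 0; i < count; i++) {
        VkPhysicalDeviceProperties props;
        vkGetPhysicalDeviceProperties(devs[i], &props);
        VkPhysicalDeviceMemoryProperties mem;
        vkGetPhysicalDeviceMemoryProperties(devs[i], &mem);
        printf("Device %u: %s\n", i, props.deviceName);
        for (uint32_t h = 0; h < mem.memoryHeapCount; h++) {
            // DEVICE_LOCAL heaps are dedicated VRAM; the rest is host/shared memory.
            int local = (mem.memoryHeaps[h].flags & VK_MEMORY_HEAP_DEVICE_LOCAL_BIT) != 0;
            printf("  heap %u: %.1f GiB %s\n", h,
                   mem.memoryHeaps[h].size / (1024.0 * 1024.0 * 1024.0),
                   local ? "(DEVICE_LOCAL, i.e. dedicated VRAM)" : "(host/shared)");
        }
    }
    vkDestroyInstance(inst, NULL);
    return 0;
}
```

On a healthy setup I'd expect the 9070 to show one big DEVICE_LOCAL heap (the dedicated VRAM) plus a smaller host-visible one.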

@LostRuins (Owner)

I doubt that. It could be your GPU settings instead.

Is it possible you configured your AMD graphics settings to always prefer shared memory? Perhaps some sort of integrated graphics power saver mode?

If you load a normal GPU workload, say a video game, does it also use shared memory?

One more option would be to try the ROCm fork https://github.com/YellowRoseCx/koboldcpp-rocm/releases/tag/v1.85.yr0-ROCm and see if that works for you.
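You could also run `vulkaninfo` from the Vulkan SDK (if you have it installed) and check the memoryHeaps section for the 9070: a healthy driver should list a large heap flagged with MEMORY_HEAP_DEVICE_LOCAL_BIT for the dedicated VRAM. If that flag is missing or the size is wrong, it's a driver problem rather than anything koboldcpp can fix.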

2Fort2 commented Mar 10, 2025

Where could I check those GPU settings? I have the AMD Adrenalin software, but it doesn't mention shared memory anywhere at all.

Games work fine, they all take up the dedicated memory.

I have tried the ROCm version; it says no ROCm devices were found and just loads into normal RAM, which isn't surprising, as the 9070/XT don't have ROCm support out of the box.

henk717 commented Mar 13, 2025

Others have reported the same thing and it seems to be a driver issue. Yesterday I helped someone for whom this did not occur, and occam (the Vulkan developer) confirmed this is beyond his control. Double-check that you are on the latest driver. If you are, then, assuming they were too, I don't know why the driver does not properly load into VRAM like it says it does.

2Fort2 commented Mar 14, 2025

Hey, thanks for the reply.

Yeah, I am on the latest drivers according to AMD Adrenalin. It still loads into shared memory. I guess all that can be done is to keep waiting for a fix?

It doesn't seem to affect everyone on the 9070/XT though.
