Offloading into shared part of VRAM #1413
Comments
Alright, so only ID 1 works. If you try manually setting the layers, does it work? Let's try setting 50 layers.
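As a sketch of what that could look like, a koboldcpp launch with the Vulkan backend and a forced layer count might be (the model path is a placeholder, and `1` is the device ID that worked above):

```shell
# Hypothetical invocation: --usevulkan selects the Vulkan backend (with an
# optional device index), --gpulayers forces how many layers go to the GPU.
koboldcpp.exe --model model.gguf --usevulkan 1 --gpulayers 50
```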
Looks like you loaded it correctly. I can see the AMD Radeon RX 9070 being selected in the log. And I see
I've tried a partial offload; it does change the value displayed, but doesn't change the final result. Strangely, even with the partial offload, it still seems to occupy the shared part of the GPU rather than leaving that part emptier, since technically it's been loaded into normal RAM. I've also tried loading models with LM Studio, and there seems to be the exact same issue. Given that both koboldcpp and LM Studio use the Vulkan llama.cpp backend, that would make sense. Could it just be that Vulkan llama.cpp is currently bugged with the 9070 and has trouble identifying the VRAM, so it opts not to load into it?
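To make the dedicated-vs-shared distinction concrete: in Vulkan, memory heaps flagged `DEVICE_LOCAL` are the card's dedicated VRAM, while unflagged heaps are host (shared) memory. The sketch below is purely illustrative (not part of koboldcpp or llama.cpp) and parses a hypothetical `vulkaninfo`-style listing with made-up sizes to show which heap is which:

```python
# Illustrative sketch: classify Vulkan memory heaps from vulkaninfo-style
# text. The SAMPLE text and sizes are hypothetical.
import re

SAMPLE = """\
memoryHeaps[0]:
    size   = 16106127360
    flags  = MEMORY_HEAP_DEVICE_LOCAL_BIT
memoryHeaps[1]:
    size   = 16819574784
    flags  = 0
"""

def classify_heaps(text):
    """Return a list of (size_bytes, is_device_local) per heap."""
    heaps = []
    for block in re.split(r"memoryHeaps\[\d+\]:", text)[1:]:
        size = int(re.search(r"size\s*=\s*(\d+)", block).group(1))
        heaps.append((size, "DEVICE_LOCAL" in block))
    return heaps

for size, local in classify_heaps(SAMPLE):
    kind = "dedicated VRAM" if local else "shared (host) memory"
    print(f"{size / 2**30:.1f} GiB -> {kind}")
```

If a backend allocates from the non-`DEVICE_LOCAL` heap, the model ends up in the shared portion even when dedicated VRAM is free, which matches the symptom described here.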
I doubt that. Instead, it could be your GPU settings. Is it possible you configured your AMD graphics settings to always prefer shared memory? Perhaps some sort of integrated-graphics power-saver mode? If you load a normal GPU workload, say a video game, does it also use shared memory? One more option would be to try the ROCm fork https://github.com/YellowRoseCx/koboldcpp-rocm/releases/tag/v1.85.yr0-ROCm and see if that works for you.
Where could I check the GPU settings? I have the AMD Adrenalin software, but it doesn't mention shared memory anywhere. Games work fine; they all take up the dedicated memory. I have tried the ROCm version, and it says there are no ROCm devices found and just loads into normal RAM, which isn't surprising, as the 9070/XT doesn't have ROCm support out of the box.
Others have reported the same thing, and it seems to be a driver issue. Yesterday I helped someone where this did not occur, and occam (the Vulkan developer) confirmed this is beyond his control. Double-check that you are on the latest driver. If you are, then I don't know why the driver does not properly load into VRAM like it says it does.
Hey, thanks for the reply. Yeah, I am on the latest drivers according to AMD Adrenalin. It still loads into shared VRAM. I guess all that can be done is to keep waiting for a fix? It doesn't seem to affect everyone on the 9070/XT, though.
Koboldcpp is offloading into the shared part of GPU memory instead of the dedicated part.
Other things I've noticed: says it is "unable to detect VRAM" on launch, and "device vulkan0 does not support async, host buffers or events" while offloading.
OS: Windows 10 IoT Enterprise LTSC
CPU: Ryzen 5 1600
GPU: Radeon RX 9070
Kobold version: koboldcpp_nocuda 1.85.1