GPU Support Missing in Version >=0.3.5 on Windows with CUDA 12.4 and RTX 3090 #1967

Open
mcglynnfinn opened this issue Mar 9, 2025 · 1 comment

Comments

@mcglynnfinn

Issue Description:

I'm experiencing a discrepancy between version 0.3.4 and later versions (>=0.3.5) regarding GPU utilization:

Version 0.3.4 (Prebuilt Wheel):
The prebuilt wheel for 0.3.4 loads the model onto the GPU; however, it's not compatible with phi4.

Version >=0.3.5:
There are no prebuilt wheels available for these versions, and when building from source, only the CPU is being used—the model does not load onto the GPU.

System Details:

Operating System: Windows 11
CUDA Version: 12.4
GPU: RTX 3090 24GB

Steps Taken:

Installed version 0.3.4 via the prebuilt wheel – confirmed GPU loading (but phi4 incompatibility remains).
Upgraded to version 0.3.5 (and above) by building from source with CUDA support enabled.
Verified that the build settings include -DGGML_CUDA=on and confirmed that the system has CUDA 12.4 installed.
Despite these configurations, the build defaults to CPU usage, and the model never loads onto the GPU.
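
For reference, the rebuild described in the steps above can be sketched as a small Python helper. This is only a sketch under assumptions: it assumes the package is installed through pip under the name llama-cpp-python and that CMake flags are passed through the CMAKE_ARGS environment variable; the helper name cuda_rebuild_plan is hypothetical. The extra pip flags force a fresh source build so a previously cached CPU-only wheel is not reused.

```python
import os
import sys


def cuda_rebuild_plan(base_env=None):
    """Return (command, env) for a forced source rebuild with CUDA enabled.

    Hypothetical helper: it only prepares the command, it does not run it.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: CMake flags reach the build through CMAKE_ARGS.
    env["CMAKE_ARGS"] = "-DGGML_CUDA=on"
    command = [
        sys.executable, "-m", "pip", "install", "llama-cpp-python",
        "--upgrade",          # move to the newest release
        "--force-reinstall",  # replace the currently installed build
        "--no-cache-dir",     # do not reuse a cached (possibly CPU-only) wheel
        "--verbose",          # show CMake output, including CUDA detection
    ]
    return command, env


if __name__ == "__main__":
    command, env = cuda_rebuild_plan()
    print("CMAKE_ARGS =", env["CMAKE_ARGS"])
    print(" ".join(command))
    # To actually run the build:
    # import subprocess; subprocess.run(command, env=env, check=True)
```

With --verbose, the CMake output in the build log should indicate whether the CUDA toolkit was found during configuration.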

Could you please advise whether this is expected behavior for versions >=0.3.5, or whether there might be an issue with GPU detection/configuration on Windows 11 with CUDA 12.4? Any guidance or troubleshooting steps to enable GPU support for these versions would be greatly appreciated.
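
One way to narrow this down is to ask the installed build directly whether it was compiled with GPU offload support. The snippet below is a minimal sketch, assuming the low-level binding llama_supports_gpu_offload is exposed by the installed llama_cpp module; the wrapper name gpu_offload_supported is hypothetical. It returns None when the module cannot be imported or the binding is missing.

```python
import importlib


def gpu_offload_supported():
    """Return True/False as reported by the build, or None if it cannot be checked."""
    try:
        llama_cpp = importlib.import_module("llama_cpp")
    except Exception:
        # Not installed, or the shared library failed to load.
        return None
    # Assumption: this low-level binding exists in the installed version.
    check = getattr(llama_cpp, "llama_supports_gpu_offload", None)
    if check is None:
        return None
    return bool(check())


if __name__ == "__main__":
    result = gpu_offload_supported()
    if result is None:
        print("Could not query llama_cpp for GPU offload support.")
    elif result:
        print("This build reports GPU offload support.")
    else:
        print("This build reports no GPU offload support (CPU-only build).")
```

If this reports True but the model still runs on the CPU, it may be worth checking that n_gpu_layers is passed when constructing Llama (for example n_gpu_layers=-1 to offload all layers), since it defaults to 0.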

@JamePeng

JamePeng commented Mar 9, 2025

Maybe you can try my new prebuilt wheels: https://github.com/JamePeng/llama-cpp-python/releases
They are based on the code in my new PR: #1966
