Workflow update - PART 1 #1416
I'll paste my comment here, and maybe we can open a new discussion. Basically, I'm concerned about the size of releases ballooning with the number of prebuilt wheel variants. I had some suggestions for long-term solutions there, but I'm not sure what the right approach is. Anecdotally, @oobabooga claims to have run into issues with GitHub throttling his prebuilt wheel repo because of this.
If you generate too many wheels, there is a 100% chance you will reach a storage quota, and GitHub will ask you to start paying for storage, or else your wheels will fail to upload. It's not too expensive (a few dollars a month at most), but it's worth keeping in mind.
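To make the concern about release size concrete, here is a rough sketch of how the wheel count grows with each build axis. The axis values and the 50 MB average wheel size are assumptions chosen for illustration only, not the project's real matrix or measured sizes.

```python
from itertools import product

# Hypothetical build axes; the values are illustrative, not the project's real matrix.
PYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12"]
PLATFORMS = ["linux", "windows", "macos"]
BACKENDS = ["cpu", "cuda-a", "cuda-b"]
SIMD_LEVELS = ["basic", "avx", "avx2"]


def count_wheels(*axes):
    """Return the number of variants produced by a full cross product of the axes."""
    return sum(1 for _ in product(*axes))


def release_size_mb(wheel_count, avg_wheel_mb):
    """Estimate the size of one release from an assumed average wheel size."""
    return wheel_count * avg_wheel_mb


wheels = count_wheels(PYTHON_VERSIONS, PLATFORMS, BACKENDS, SIMD_LEVELS)
# 50 MB per wheel is an assumption used only to make the arithmetic concrete.
print(wheels, "wheels,", release_size_mb(wheels, 50), "MB per release")
```

Every extra value on any axis multiplies the total, which is why trimming a single axis can shrink a release noticeably.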
I avoided the API quota limit problems by adding a timer in my YAML workflow.
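As a minimal sketch of the pacing idea only, written in Python rather than workflow YAML: pause between consecutive uploads so that requests are spread out over time. The function name, the delay value and the `upload` callback are hypothetical and are not taken from the workflow mentioned above.

```python
import time


def upload_with_delay(wheels, upload, delay_seconds=5.0, sleep=time.sleep):
    """Call upload(wheel) for each wheel, pausing between consecutive calls."""
    uploaded = []
    for index, wheel in enumerate(wheels):
        if index > 0:
            sleep(delay_seconds)  # pause before every upload except the first
        upload(wheel)
        uploaded.append(wheel)
    return uploaded


# Usage with a stand-in upload function and a no-op sleep, so nothing is sent or delayed.
sent = []
upload_with_delay(["a.whl", "b.whl"], sent.append, sleep=lambda seconds: None)
```

Passing `sleep` as a parameter keeps the helper easy to try out without waiting. A fixed delay is the simplest choice; whether it is long enough depends on the actual rate limit, which is not stated here.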
Not enabling AVX penalizes llama-cpp-python performance on both CPU and CUDA builds.
Maybe the list can be shrunk down a bit. For example:
@Smartappli Your changes are adding AVX for CUDA wheels. Is that needed? At that point the user is using the GPU. It makes sense for the basic wheels to have AVX and AVX2 variants, but not so much for the CUDA ones.
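The trade-off in this comment can be sketched as a pruned variant list: every SIMD level is kept for CPU builds, while GPU builds get a single level. The backend and SIMD labels below are hypothetical, and `gpu_simd` is an illustrative parameter, not an option of the real workflow.

```python
from itertools import product

BACKENDS = ["cpu", "cuda"]              # hypothetical backend labels
SIMD_LEVELS = ["basic", "avx", "avx2"]  # hypothetical SIMD labels


def pruned_variants(backends, simd_levels, gpu_simd="basic"):
    """Keep every SIMD level for CPU builds but only one level for other backends."""
    variants = []
    for backend, simd in product(backends, simd_levels):
        if backend != "cpu" and simd != gpu_simd:
            continue  # skip the extra SIMD variants for GPU builds
        variants.append((backend, simd))
    return variants


full = list(product(BACKENDS, SIMD_LEVELS))
pruned = pruned_variants(BACKENDS, SIMD_LEVELS)
print(len(full), "variants before pruning,", len(pruned), "after")
```

If AVX does matter for CUDA builds, as argued earlier in the thread, calling `pruned_variants(BACKENDS, SIMD_LEVELS, gpu_simd="avx")` keeps a single AVX-enabled CUDA variant instead of the basic one.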
Copy that, thanks @gaby. In summary: AVX and AVX2 on CPU is
Tests 11 May 24
ping @gaby
@abetlen can you review plz?
Hey @Smartappli, thanks for your patience and the PR. It's been a busy month, so I'm just catching up on open PRs right now. Do you mind splitting this one up into two, with one that includes the following
and another just for the CPU wheel changes?
Has anyone managed to fix the CUDA workflows? Mine keep failing with error
See: https://github.com/oobabooga/llama-cpp-python-cuBLAS-wheels/actions/runs/9457447475/job/26051277254. I see that @abetlen's workflow also fails with the same error: https://github.com/abetlen/llama-cpp-python/actions/runs/9457182450/job/26051175939
CUDA compiled with AVX
Remove Python 3.8
Remove deprecated macos-11
Add Python 3.9 where missing
Upgrade macos-13 to macos-latest in tests
Upgrade ubuntu-20.04 to ubuntu-latest
Upgrade windows-2019 to windows-latest
Refactor the Metal build