Releases · Nexesenex/croco.cpp

WIP.
Brings back the 2nd gen of Ikawrakow's IQ_K quants (IQ4_KS, IQ2_KS), plus the new IQ5_KS.
Also brings his CUDA MMQ kernels for those three, and for the 1st gen IQ_K quants as well.
Yes, my commit list is a damn mess, and the GPU-auto-layer needs to be fixed among other things.
But aside from that, it works for me. +100% PP (prompt processing) speed at BBS (BLAS batch size) 128 for the IQ_K quants compared to cuBLAS mode.
Compiled on an Ampere machine; it might work on Pascal and Turing as well, and of course on more recent GPUs.
CUDA release, as usual. Don't expect anything else to work.
Esobold's pdfplumber is not included; I can't compile on Windows with it.

Note: Cudart is v12.9, not v12.0 as my messy changelog says.

Full Changelog: v1.91015_b5326_RM1.100...v1.92060_b5427_RM1.102

Releases: Nexesenex/croco.cpp

Croco.Cpp_FrankenFork_v1.93000_b5548_RMv1.11.3

NXS_Llama.cpp_v0.08_b5548

NXS_Llama.cpp_v0.06_b5506

NXS_Llama.cpp_v0.05_b5506

v1.92120_b5506_RM1.111m

NXS_Llama.cpp Alpha 0.01 - b5474

NXS_Llama.cpp Alpha 0.04 - b5525

NXS_Llama.cpp Alpha 0.02 - b5517

Croco.Cpp_FrankenFork_v1.92105_b5474_RM1.111m

Croco.Cpp_FrankenFork_v1.92060_b5427_RM1.102
