Releases: Nexesenex/croco.cpp
Croco.Cpp_FrankenFork_v1.93000_b5548_RMv1.11.3
NXS_Llama.cpp_v0.08_b5548
Merge branch 'nb_5548' into NXS_Llama.cpp
NXS_Llama.cpp_v0.06_b5506
Adds FTYPES for the IK_Llama quants, including IQ3_KS.
Also adds a custom quants function; see the quantize.cpp help.
NXS_Llama.cpp_v0.05_b5506
Can quantize to: Q6_0, IQ2_K, IQ3_K, IQ4_K, IQ5_K, IQ6_K, IQ5_KS, IQ4_KS, IQ3_KS, IQ2_KS, IQ4_KSS.
These quants can be read by Croco.cpp (latest version).
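As a rough illustration of batching the quantizations listed above, the sketch below builds one command line per type. It assumes the fork keeps mainline llama.cpp's `<input> <output> <type>` argument order for its quantize tool and accepts these type names verbatim. The binary name `llama-quantize` and the file names are hypothetical here, so check the quantize help for the real usage.

```python
from pathlib import Path

# Quant types listed in the v0.05 release note above.
IK_QUANT_TYPES = [
    "Q6_0", "IQ2_K", "IQ3_K", "IQ4_K", "IQ5_K", "IQ6_K",
    "IQ5_KS", "IQ4_KS", "IQ3_KS", "IQ2_KS", "IQ4_KSS",
]


def build_quantize_command(source, quant_type, binary="llama-quantize"):
    """Return an argv list for one quantization run.

    Assumption: the binary takes <input> <output> <type>, as mainline
    llama.cpp's quantize tool does. The binary name is hypothetical.
    """
    if quant_type not in IK_QUANT_TYPES:
        raise ValueError(f"unsupported quant type: {quant_type}")
    source = Path(source)
    # e.g. model-f16.gguf -> model-f16.IQ4_KS.gguf
    target = source.with_name(f"{source.stem}.{quant_type}.gguf")
    return [binary, str(source), str(target), quant_type]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is executed blindly.
    for qtype in IK_QUANT_TYPES:
        print(" ".join(build_quantize_command("model-f16.gguf", qtype)))
```

The resulting list can be passed to `subprocess.run` once the real binary name and arguments have been confirmed.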
v1.92120_b5506_RM1.111m
WIP.
Full Changelog: v1.92105_b5474_RM1.111m...Croco_v1.92120_b5506_RM1.111m
NXS_Llama.cpp Alpha 0.01 - b5474
First alpha version of NXS_Llama.cpp, a mainline llama koboldified, then ikawrified, based on my Croco mash-up.
Based on Llama.cpp b5474, and on some commits of IK_Llama.cpp.
Supports quantization to Q6_0, IQ3_K, IQ4_K, IQ5_K, and IQ6_K.
No Cuda, nothing fancy. Just for high-quality quantizations to use with Croco.cpp.
Credits :
- The authors and contributors of Llama.cpp, IK_Llama.cpp (and notably Ikawrakow), and Kobold.cpp (and notably Concedo).
NXS_Llama.cpp Alpha 0.04 - b5525
Cuda works partially.
PPL test works with Gemma 3 and Llama.
Inference works only with a head size of 256 and an fp16 cache, so only Gemma 3 for now.
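The limitation above can be written as a small pre-flight check. This is only a sketch: it reads "a head of 256" as an attention head size of 256, and the `InferenceSetup` type and its field names are hypothetical, not part of NXS_Llama.cpp.

```python
from dataclasses import dataclass


@dataclass
class InferenceSetup:
    head_size: int      # attention head dimension of the model
    kv_cache_type: str  # KV cache element type, e.g. "f16" or "q8_0"


def is_supported(setup):
    """Mirror the Alpha 0.04 limitation: head size 256 and an fp16 cache."""
    return setup.head_size == 256 and setup.kv_cache_type == "f16"


if __name__ == "__main__":
    # Hypothetical configurations, for illustration only.
    print(is_supported(InferenceSetup(head_size=256, kv_cache_type="f16")))   # True
    print(is_supported(InferenceSetup(head_size=128, kv_cache_type="f16")))   # False
    print(is_supported(InferenceSetup(head_size=256, kv_cache_type="q8_0")))  # False
```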
NXS_Llama.cpp Alpha 0.02 - b5517
NXS_v0.02_b5517 Alpha 0.02 - b5517 - Merge branch 'master' into NXS_Llama.cpp
Croco.Cpp_FrankenFork_v1.92105_b5474_RM1.111m
WIP.
Adds the new SWA (sliding-window attention) cache implementation.
Works on Gemma 3 for me.
Compiled with Cuda 12.9 for Pascal, Turing, and Ampere+
Croco.Cpp_FrankenFork_v1.92060_b5427_RM1.102
WIP.
Brings back the 2nd gen of Ikawrakow's IQ_K quants (IQ4_KS, IQ2_KS), plus the new IQ5_KS.
Also brings his CUDA MMQ kernels for those three, and for the 1st gen IQ_K quants as well.
Yes, my commit list is a damn mess, and the GPU-auto-layer needs to be fixed among other things.
But aside from that, it works for me: +100% PP (prompt processing) at BBS 128 for the IQ_K quants compared to Cublas mode.
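For reference, a +100% figure means the throughput doubled. The sketch below shows the arithmetic, reading PP as prompt-processing throughput in tokens per second; the numbers in it are hypothetical and are not measurements from this release.

```python
def percent_gain(baseline_tps, new_tps):
    """Relative throughput gain in percent (tokens per second)."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (new_tps - baseline_tps) / baseline_tps * 100.0


if __name__ == "__main__":
    # Hypothetical numbers: 400 t/s in the baseline mode, 800 t/s with the new kernels.
    print(percent_gain(400.0, 800.0))  # 100.0
```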
Compiled on an Ampere machine, might work on Pascal and Turing as well, and of course on more recent GPUs.
Cuda release, as usual. Don't expect anything else to work.
Esobold's pdfplumber is not included; I can't compile on Windows with it.
Note : Cudart is v12.9, not 12.0 as my messy changelog says.
Full Changelog: v1.91015_b5326_RM1.100...v1.92060_b5427_RM1.102