koboldcpp-1.57 - CUDA 12.3 build
102 commits to concedo since this release
I have merged the (currently unmerged upstream) llama.cpp PR that speeds up Mixtral prompt processing. It should give roughly a ~1.25x prompt processing speedup for all CPU layers.
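To put the ~1.25x figure in concrete terms, here is a small illustrative sketch; the 120-second baseline is a made-up example, not a measured benchmark:

```python
# Illustrative only: what a ~1.25x prompt-processing speedup means in practice.
# The baseline time below is a hypothetical example, not a measured number.

def sped_up_time(baseline_seconds: float, speedup: float = 1.25) -> float:
    """Return the new processing time after applying a speedup factor."""
    return baseline_seconds / speedup

baseline = 120.0  # hypothetical time to process a long prompt on CPU layers
print(f"{baseline:.0f}s -> {sped_up_time(baseline):.0f}s")  # 120s -> 96s
```

So a prompt that previously took two minutes to process on CPU layers would take about a minute and a half after the speedup.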