0.13.3

@b4rtaz released this 09 Apr 20:14
afa6297

This version fixes the selection of memory type in Vulkan, significantly improving inference speed on NVIDIA GPUs.
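
For readers unfamiliar with what "selecting a memory type" involves, here is a minimal, generic sketch of the usual Vulkan pattern. It is not the code from this release, and the release notes do not say what exactly was wrong. The `MemoryType` and `MemoryProperties` structs and both function names are hypothetical stand-ins (mirroring the layout of `VkPhysicalDeviceMemoryProperties`) so that the example compiles without the Vulkan SDK; only the flag values are taken from the Vulkan specification.

```cpp
#include <cassert>
#include <cstdint>

// Flag values as defined for VkMemoryPropertyFlagBits in the Vulkan spec.
constexpr uint32_t DEVICE_LOCAL  = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr uint32_t HOST_VISIBLE  = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr uint32_t HOST_COHERENT = 0x4; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// Hypothetical stand-ins for VkMemoryType / VkPhysicalDeviceMemoryProperties.
struct MemoryType { uint32_t propertyFlags; uint32_t heapIndex; };
struct MemoryProperties { uint32_t memoryTypeCount; MemoryType memoryTypes[32]; };

// Returns the index of the first memory type that is permitted by typeBits
// (as reported in VkMemoryRequirements::memoryTypeBits) and has every flag
// in `required` set, or -1 if no such type exists.
int findMemoryType(const MemoryProperties& props, uint32_t typeBits, uint32_t required) {
    assert(props.memoryTypeCount <= 32);
    for (uint32_t i = 0; i < props.memoryTypeCount; i++) {
        bool allowed = (typeBits & (1u << i)) != 0;
        bool hasFlags = (props.memoryTypes[i].propertyFlags & required) == required;
        if (allowed && hasFlags) return static_cast<int>(i);
    }
    return -1;
}

// Assumed policy for buffers that only the GPU reads during inference:
// prefer device-local memory, fall back to host-visible coherent memory.
int pickGpuBufferMemoryType(const MemoryProperties& props, uint32_t typeBits) {
    int index = findMemoryType(props, typeBits, DEVICE_LOCAL);
    if (index >= 0) return index;
    return findMemoryType(props, typeBits, HOST_VISIBLE | HOST_COHERENT);
}
```

On a discrete GPU, a buffer placed in a host-visible type that is not device-local typically lives in system RAM and is read over PCIe, which is one plausible way a wrong choice here could slow down inference.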

With this fix, it's now possible to run Distributed Llama on two GPUs within the same machine. Check this test.