
Releases: b4rtaz/distributed-llama

0.1.1

23 Jan 23:07
f2137af

This version introduces partial optimization for x86_64 AVX2 CPUs. It is now possible to run inference with Q40 weights and a Q80 buffer using partial AVX2 acceleration.

0.1.0

23 Jan 22:50

Initial release! 🚢