I want to integrate Ascend NPU hardware capabilities into ramalama, enabling users to leverage NPUs for inference and serving of LLMs.
Ascend NPU is an AI processor that supports LLM inference engines such as vLLM, llama.cpp, and ONNX Runtime.
It also supports AI frameworks like PyTorch and TensorFlow, which are already integrated with Hugging Face and DeepSpeed, and with libraries such as Transformers/Accelerate for training and fine-tuning.
This patch addresses two aspects:

- Add code in ramalama to build the llama.cpp CANN binary and docker image.
- Add code in ramalama to select the vLLM Ascend docker image.