[Bug]: deepseek-v2-lite-w8a8 accuracy is incorrect #883
Comments
Offline test:
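The original test script was not captured in this export; a minimal sketch of such an offline test, assuming the vLLM offline `LLM` API and the `quantization="ascend"` option from the vllm-ascend tutorial (the local model path is a placeholder), might look like:

```python
from vllm import LLM, SamplingParams

# Hypothetical reconstruction: the reporter's actual script was not captured.
# Model path and quantization option follow the vllm-ascend multi-NPU
# quantization tutorial; adjust to your local checkout.
prompts = ["Hello, my name is"]
sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=32)

llm = LLM(
    model="./DeepSeek-V2-Lite-w8a8",  # local w8a8 weights (assumed path)
    quantization="ascend",            # per the vllm-ascend tutorial
    trust_remote_code=True,
    max_model_len=4096,
)

for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)
```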
Result:
I can get correct output with TP=2 by following #628 (comment). TP=1 still needs to be fixed, though.
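For reference, a sketch of that TP=2 workaround using the offline API (model path and sampling settings are assumptions, not taken from #628):

```python
from vllm import LLM, SamplingParams

# Sketch of the TP=2 workaround from the comment above; the path is assumed.
llm = LLM(
    model="./DeepSeek-V2-Lite-w8a8",
    quantization="ascend",
    tensor_parallel_size=2,  # TP=2 reportedly produces correct output
    trust_remote_code=True,
)

outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```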
@learning-chip I tested the code:
Result:
The latest compatible tag of msmodelslim is still being adapted; please keep an eye on the documentation for updates.
@Potabk will the model on https://www.modelscope.cn/models/vllm-ascend/DeepSeek-V2-Lite-w8a8 also be updated? I directly downloaded the model without using msmodelslim for quantization.
@tangzhiyi11 this model was produced by strictly following https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_npu_quantization.html; once the msmodelslim tag lands, the weights will be updated.
@Potabk ok, thx. By the way, where can I download the weights for deepseek-r1-w8a8?
https://www.modelscope.cn/models/vllm-ascend/DeepSeek-R1-W8A8. We are uploading the weights. |
@22dimensions I have downloaded the weights for DeepSeek-R1-W8A8, and I have a few questions regarding them that I would like to ask for clarification.
Your current environment
🐛 Describe the bug
Downloaded model: https://www.modelscope.cn/models/vllm-ascend/DeepSeek-V2-Lite-w8a8
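For anyone reproducing this, a hedged sketch of fetching those weights with ModelScope's `snapshot_download` (the `cache_dir` is an arbitrary choice):

```python
from modelscope import snapshot_download

# Download the quantized weights from ModelScope; the model id is taken
# from the URL above, the cache directory is an arbitrary local choice.
model_dir = snapshot_download(
    "vllm-ascend/DeepSeek-V2-Lite-w8a8",
    cache_dir="./models",
)
print(model_dir)
```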
Server launch:
Request:
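The original request was not captured in this export; a hypothetical equivalent against the OpenAI-compatible `/v1/completions` endpoint, assuming the default port 8000 and that the served model name matches the local weights path:

```python
import requests

# Hypothetical reconstruction of the request; the original was not captured.
# Assumes the server exposes the OpenAI-compatible API on the default port.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "./DeepSeek-V2-Lite-w8a8",  # must match the served model name (assumed)
        "prompt": "Hello, my name is",
        "max_tokens": 32,
        "temperature": 0.6,
    },
)
print(resp.json())
```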
Response:
Followed the steps at https://vllm-ascend.readthedocs.io/en/latest/tutorials/multi_npu_quantization.html exactly.
Environment:
vllm-ascend main branch
vllm v0.8.5.post1
CANN: 8.1.RC1
NPU: 910B