Your current environment

vllm serve /tmp/modelscope/hub/models/QwQ-32B-w8a8/ --served-model-name qwq-32b-w8a8 --host 0.0.0.0 --port 9007 -tp 4 --max-model-len 32768 --quantization ascend

It reports that ascend is not supported. The image is m.daocloud.io/quay.io/ascend/vllm-ascend:v0.8.5rc1.

🐛 Describe the bug

usage: vllm serve [model_tag] [options]
vllm serve: error: argument --quantization/-q: invalid choice: 'ascend' (choose from 'aqlm', 'awq', 'deepspeedfp', 'tpu_int8', 'fp8', 'ptpc_fp8', 'fbgemm_fp8', 'modelopt', 'nvfp4', 'marlin', 'bitblas', 'gguf', 'gptq_marlin_24', 'gptq_marlin', 'gptq_bitblas', 'awq_marlin', 'gptq', 'compressed-tensors', 'bitsandbytes', 'qqq', 'hqq', 'experts_int8', 'neuron_quant', 'ipex', 'quark', 'moe_wna16', 'torchao', None)
Online mode only works with --quantization ascend as of this PR: 00e0243
--quantization ascend doesn't work in 0.8.5. Can you cherry-pick that PR by hand and try again?
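In case it helps, a minimal sketch of the cherry-pick workflow. It assumes a source checkout of vllm-ascend, that 00e0243 is the commit from the PR above, and that a v0.8.5rc1 tag matching the image version exists:

git clone https://github.com/vllm-project/vllm-ascend.git
cd vllm-ascend
git checkout v0.8.5rc1      # assumed tag matching the image version
git cherry-pick 00e0243     # apply the fix commit by hand
pip install -e .            # reinstall so vllm serve picks up the change

After reinstalling, rerun the vllm serve command above with --quantization ascend.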