[Bug]: Qwen3-235B-A22B-AWQ AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping' #900

Open
huazq opened this issue May 19, 2025 · 0 comments
Labels
bug Something isn't working

huazq commented May 19, 2025

Your current environment

The output of `python collect_env.py`
(evalscope) [root@localhost ~]# npu-smi info
+------------------------------------------------------------------------------------------------+
| npu-smi 24.1.0.3                 Version: 24.1.0.3                                             |
+---------------------------+---------------+----------------------------------------------------+
| NPU   Name                | Health        | Power(W)    Temp(C)           Hugepages-Usage(page)|
| Chip                      | Bus-Id        | AICore(%)   Memory-Usage(MB)  HBM-Usage(MB)        |
+===========================+===============+====================================================+
| 0     910B3               | OK            | 98.9        38                0    / 0             |
| 0                         | 0000:C1:00.0  | 0           0    / 0          3382 / 65536         |
+===========================+===============+====================================================+
| 1     910B3               | OK            | 101.0       39                0    / 0             |
| 0                         | 0000:C2:00.0  | 0           0    / 0          3594 / 65536         |
+===========================+===============+====================================================+
| 2     910B3               | OK            | 98.2        39                0    / 0             |
| 0                         | 0000:81:00.0  | 0           0    / 0          3594 / 65536         |
+===========================+===============+====================================================+
| 3     910B3               | OK            | 99.3        40                0    / 0             |
| 0                         | 0000:82:00.0  | 0           0    / 0          3595 / 65536         |
+===========================+===============+====================================================+
| 4     910B3               | OK            | 95.0        40                0    / 0             |
| 0                         | 0000:01:00.0  | 0           0    / 0          3597 / 65536         |
+===========================+===============+====================================================+
| 5     910B3               | OK            | 102.8       44                0    / 0             |
| 0                         | 0000:02:00.0  | 0           0    / 0          3594 / 65536         |
+===========================+===============+====================================================+
| 6     910B3               | OK            | 102.4       44                0    / 0             |
| 0                         | 0000:41:00.0  | 0           0    / 0          3593 / 65536         |
+===========================+===============+====================================================+
| 7     910B3               | OK            | 92.5        43                0    / 0             |
| 0                         | 0000:42:00.0  | 0           0    / 0          3594 / 65536         |
+===========================+===============+====================================================+

🐛 Describe the bug

[root@d721607e932e workspace]# vllm serve /mnt/models/Qwen3-235B-A22B-AWQ --served-model-name qwen3-235B --dtype bfloat16 -tp 8 --gpu-memory-utilization 0.9 --host 0.0.0.0 --port 8000 --max-model-len 8192 --trust-remote-code --quantization awq
.....
(VllmWorkerProcess pid=6341) INFO 05-19 09:38:40 [model_runner.py:953] Starting to load model /mnt/models/Qwen3-235B-A22B-AWQ...
(VllmWorkerProcess pid=6342) INFO 05-19 09:38:40 [model_runner.py:953] Starting to load model /mnt/models/Qwen3-235B-A22B-AWQ...
WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6339) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6338) INFO 05-19 09:38:40 [model_runner.py:953] Starting to load model /mnt/models/Qwen3-235B-A22B-AWQ...
(VllmWorkerProcess pid=6341) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6340) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6342) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6343) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6337) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
(VllmWorkerProcess pid=6338) WARNING 05-19 09:38:40 [utils.py:168] The model class Qwen3MoeForCausalLM has not defined packed_modules_mapping, this may lead to incorrect mapping of quantized or ignored modules
ERROR 05-19 09:38:40 [engine.py:448] 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
ERROR 05-19 09:38:40 [engine.py:448] Traceback (most recent call last):
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/engine/multiprocessing/engine.py", line 436, in run_mp_engine
ERROR 05-19 09:38:40 [engine.py:448] engine = MQLLMEngine.from_vllm_config(
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/engine/multiprocessing/engine.py", line 128, in from_vllm_config
ERROR 05-19 09:38:40 [engine.py:448] return cls(
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/engine/multiprocessing/engine.py", line 82, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.engine = LLMEngine(*args, **kwargs)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/engine/llm_engine.py", line 275, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.model_executor = executor_class(vllm_config=vllm_config)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/executor/executor_base.py", line 286, in __init__
ERROR 05-19 09:38:40 [engine.py:448] super().__init__(*args, **kwargs)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/executor/executor_base.py", line 52, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self._init_executor()
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/executor/mp_distributed_executor.py", line 125, in _init_executor
ERROR 05-19 09:38:40 [engine.py:448] self._run_workers("load_model",
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/executor/mp_distributed_executor.py", line 185, in _run_workers
ERROR 05-19 09:38:40 [engine.py:448] driver_worker_output = run_method(self.driver_worker, sent_method,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
ERROR 05-19 09:38:40 [engine.py:448] return func(*args, **kwargs)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
ERROR 05-19 09:38:40 [engine.py:448] self.model_runner.load_model()
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
ERROR 05-19 09:38:40 [engine.py:448] self.model = get_model(vllm_config=self.vllm_config)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
ERROR 05-19 09:38:40 [engine.py:448] return loader.load_model(vllm_config=vllm_config)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
ERROR 05-19 09:38:40 [engine.py:448] model = _initialize_model(vllm_config=vllm_config)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
ERROR 05-19 09:38:40 [engine.py:448] return model_class(vllm_config=vllm_config, prefix=prefix)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.model = Qwen3MoeModel(vllm_config=vllm_config,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
ERROR 05-19 09:38:40 [engine.py:448] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.start_layer, self.end_layer, self.layers = make_layers(
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
ERROR 05-19 09:38:40 [engine.py:448] [PPMissingLayer() for _ in range(start_layer)] + [
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
ERROR 05-19 09:38:40 [engine.py:448] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
ERROR 05-19 09:38:40 [engine.py:448] lambda prefix: Qwen3MoeDecoderLayer(config=config,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.self_attn = Qwen3MoeAttention(
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.qkv_proj = QKVParallelLinear(hidden_size,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
ERROR 05-19 09:38:40 [engine.py:448] super().__init__(input_size=input_size,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
ERROR 05-19 09:38:40 [engine.py:448] super().__init__(input_size,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
ERROR 05-19 09:38:40 [engine.py:448] self.quant_method = quant_config.get_quant_method(self,
ERROR 05-19 09:38:40 [engine.py:448] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
ERROR 05-19 09:38:40 [engine.py:448] self.packed_modules_mapping):
ERROR 05-19 09:38:40 [engine.py:448] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6339) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6341) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6342) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/init.py", line 14, in get_model
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6343) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6340) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6337) ERROR 05-19 09:38:40 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] Exception in worker VllmWorkerProcess while processing method load_model.
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] Traceback (most recent call last):
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/executor/multiproc_worker_utils.py", line 232, in _run_worker_process
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] output = run_method(worker, method, args, kwargs)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/utils.py", line 2456, in run_method
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] return func(*args, **kwargs)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/worker.py", line 235, in load_model
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.model_runner.load_model()
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/worker/model_runner.py", line 955, in load_model
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.model = get_model(vllm_config=self.vllm_config)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/__init__.py", line 14, in get_model
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] return loader.load_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 452, in load_model
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] model = _initialize_model(vllm_config=vllm_config)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/model_loader/loader.py", line 133, in _initialize_model
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] return model_class(vllm_config=vllm_config, prefix=prefix)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 488, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.model = Qwen3MoeModel(vllm_config=vllm_config,
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/compilation/decorators.py", line 151, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 334, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.start_layer, self.end_layer, self.layers = make_layers(
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 609, in make_layers
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] [PPMissingLayer() for _ in range(start_layer)] + [
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/utils.py", line 610, in <listcomp>
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] maybe_offload_to_cpu(layer_fn(prefix=f"{prefix}.{idx}"))
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 336, in <lambda>
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] lambda prefix: Qwen3MoeDecoderLayer(config=config,
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 256, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.self_attn = Qwen3MoeAttention(
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/models/qwen3_moe.py", line 186, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.qkv_proj = QKVParallelLinear(hidden_size,
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 849, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] super().__init__(input_size=input_size,
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 395, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] super().__init__(input_size,
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm/vllm/model_executor/layers/linear.py", line 242, in __init__
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.quant_method = quant_config.get_quant_method(self,
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] File "/vllm-workspace/vllm-ascend/vllm_ascend/quantization/quant_config.py", line 93, in get_quant_method
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] self.packed_modules_mapping):
(VllmWorkerProcess pid=6338) ERROR 05-19 09:38:41 [multiproc_worker_utils.py:238] AttributeError: 'AscendQuantConfig' object has no attribute 'packed_modules_mapping'
ERROR 05-19 09:38:42 [multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 6341 died, exit code: -15
ERROR 05-19 09:38:42 [multiproc_worker_utils.py:120] Worker VllmWorkerProcess pid 6343 died, exit code: -15
INFO 05-19 09:38:42 [multiproc_worker_utils.py:124] Killing local vLLM worker processes
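
Every worker fails at the same point: `get_quant_method` in `vllm_ascend/quantization/quant_config.py` reads `self.packed_modules_mapping`, and the `AscendQuantConfig` instance does not have that attribute. The log does not show why it is missing. One plausible cause, which is an assumption and not confirmed here, is that the attribute is normally set in a base-class `__init__` that the subclass never calls. The sketch below reproduces that general pattern with hypothetical stand-in classes (`BaseQuantConfig`, `BrokenConfig`, `FixedConfig` are illustrative names, not the real vLLM or vllm-ascend classes):

```python
class BaseQuantConfig:
    """Hypothetical stand-in for a base quantization config."""

    def __init__(self):
        # The base initializer is the only place the attribute is created.
        self.packed_modules_mapping = {}


class BrokenConfig(BaseQuantConfig):
    """Overrides __init__ without calling super().__init__()."""

    def __init__(self, quant_description):
        self.quant_description = quant_description


class FixedConfig(BaseQuantConfig):
    """Same subclass, but the base initializer runs first."""

    def __init__(self, quant_description):
        super().__init__()
        self.quant_description = quant_description


def raises_attribute_error(config):
    """Return True if reading packed_modules_mapping fails on this config."""
    try:
        config.packed_modules_mapping
    except AttributeError:
        return True
    return False


def packed_mapping(config):
    """Defensive read: fall back to an empty mapping if the attribute is absent."""
    return getattr(config, "packed_modules_mapping", {})


print(raises_attribute_error(BrokenConfig({})))  # True
print(raises_attribute_error(FixedConfig({})))   # False
print(packed_mapping(BrokenConfig({})))          # {}
```

If this is the cause, either calling the base initializer or reading the attribute defensively would avoid the crash. Whether an empty mapping is then correct for an AWQ checkpoint on Ascend is a separate question this sketch does not answer.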

@huazq huazq added the bug Something isn't working label May 19, 2025