Please provide details about your parallelism settings, such as the tensor/data parallel sizes for both the prefill and decode nodes, the number of NPUs in use, and so on.
Your current environment
The output of `python collect_env.py`
    docker run -it \
      --name vllm-ascend-xhy-pd-8 \
      --device /dev/davinci0 \
      --device /dev/davinci1 \
      --device /dev/davinci2 \
      --device /dev/davinci3 \
      --device /dev/davinci4 \
      --device /dev/davinci5 \
      --device /dev/davinci6 \
      --device /dev/davinci7 \
      --device /dev/davinci_manager \
      --device /dev/devmm_svm \
      --device /dev/hisi_hdc \
      -v /usr/local/dcmi:/usr/local/dcmi \
      -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
      -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
      -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
      -v /etc/ascend_install.info:/etc/ascend_install.info \
      -v /root/.cache:/root/.cache \
      -v /mnt/nvme0/xhy/workspace:/workspace-xhy \
      --privileged=true --net=host \
      -it vllm-ascend-xhy-pd:v1.0 bash

🐛 Describe the bug
WARNING 05-08 09:54:12 connector.py:342] [rank7]: Failed to receive all KVs and hidden states, redo model forwarding
The warning indicates that the decode (D) node failed to receive the KV cache and hidden states from the prefill (P) node, with error code LLM_KV_CACHE_NOT_EXIST. As a result, the D node had to redo model forwarding itself. Possible causes include network problems between the nodes, a KV cache that was not properly initialized on the P node, misconfigured services or nodes, requests arriving out of sequence, or resource limits that cause requests to fail.
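
To make the fallback in the warning concrete, here is a minimal sketch of the "receive, or else recompute" pattern on the node that expects the transferred KV cache. It is not the connector's actual code: `TransferResult`, `run_step`, `receive`, and `forward` are hypothetical names chosen for illustration, and the real logic in `connector.py` may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TransferResult:
    """Hypothetical container for what the receiving node expects."""
    kv_cache: Optional[list]       # None when the transfer failed
    hidden_states: Optional[list]  # None when the transfer failed


def run_step(
    receive: Callable[[], TransferResult],
    forward: Callable[[], list],
) -> tuple[list, bool]:
    """Return (hidden_states, recomputed).

    Reuses the transferred results when both parts arrived; otherwise
    falls back to running the model locally.
    """
    result = receive()
    if result.kv_cache is None or result.hidden_states is None:
        # This branch mirrors the "redo model forwarding" path in the warning.
        return forward(), True
    return result.hidden_states, False


# Simulate one failed transfer and one successful transfer.
failed = run_step(lambda: TransferResult(None, None), lambda: [0.1, 0.2])
ok = run_step(lambda: TransferResult([1], [0.3]), lambda: [0.1, 0.2])
print(failed)  # ([0.1, 0.2], True)
print(ok)      # ([0.3], False)
```

If the real system follows this pattern, the warning alone does not mean the request failed: the request still completes, but the receiving node pays for a full forward pass, so frequent occurrences show up as extra latency rather than as errors.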