Remote Code Execution via Mooncake Integration

Summary

When vLLM is configured to use Mooncake, unsafe deserialization exposed directly over ZMQ/TCP will allow attackers to execute remote code on distributed hosts.

Details

Pickle deserialization vulnerabilities are well documented.
The mooncake pipe is exposed over the network (by design to enable disaggregated prefilling across distributed environments) using ZMQ over TCP, greatly increasing exploitability. Only sender_socket and receiver_ack are allowed to be accessed publicly, while the data actually decompressed by pickle.loads() comes from recv_bytes. Its interface is defined as self.receiver_socket.connect(f\"tcp://{d_host}:{d_rank_offset + 1}\"), where d_host is decode_host, a locally defined address 192.168.0.139,from mooncake.json (https://github.com/kvcache-ai/Mooncake/blob/main/doc/en/vllm-integration-v0.2.md?plain=1#L36).
The root problem is recv_tensor() calls _recv_impl which passes the raw network bytes to pickle.loads(). Additionally, it does not appear that there are any controls (network, authentication, etc) to prevent arbitrary users from sending this payload to the affected service.

Impact

This is a remote code execution vulnerability impacting any deployments using Mooncake to distribute KV across distributed hosts.

Remediation

This issue is resolved by #14228

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remote Code Execution via Mooncake Integration

Package

Affected versions

Patched versions

Description

Summary

Details

Impact

Remediation

Severity

CVSS overall score

CVSS v3 base metrics

CVSS v3 base metrics

CVE ID

Weaknesses

Credits