CVE-2025-32444

Critical · Public PoC · RCE

Published: 30 April 2025

Modified

28 May 2025

KEV Added

—

Patch

—

CVSS Score (v3.1): 10.0 · CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H
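As a worked check of the score above, the following sketch recomputes the base score from the vector string using the CVSS v3.1 base-score equations. The weight table is deliberately limited to the metric values that appear in this vector, and the names `WEIGHTS`, `roundup`, and `base_score` are illustrative, not part of any library.

```python
import math

# CVSS v3.1 metric weights, limited to the values used in this CVE's vector.
WEIGHTS = {
    "AV": {"N": 0.85},
    "AC": {"L": 0.77},
    "PR": {"N": 0.85},
    "UI": {"N": 0.85},
    "C": {"H": 0.56},
    "I": {"H": 0.56},
    "A": {"H": 0.56},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a vector covered by WEIGHTS."""
    # Skip the leading "CVSS:3.1" prefix and map metric name -> value.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    w = {name: WEIGHTS[name][parts[name]] for name in WEIGHTS}

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]

    if parts["S"] == "C":  # scope changed
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
        raw = 1.08 * (impact + exploitability)
    else:  # scope unchanged
        impact = 6.42 * iss
        raw = impact + exploitability

    if impact <= 0:
        return 0.0
    return roundup(min(raw, 10))


score = base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:H/A:H")
```

With scope changed, the raw value exceeds 10 and is capped, which is why this vector reaches the maximum score.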

EPSS Score: 0.0545 (90.4th percentile)

Risk Priority: 23 (weights: 60% EPSS · 20% KEV · 20% CVSS)
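The page does not state how the three weights are combined. One reading that reproduces the displayed value of 23 is a weighted sum in which each input is first scaled to 0-100: the EPSS probability times 100, KEV as 0 or 100, and the CVSS score times 10. The sketch below encodes that assumption; the function name and the scaling are guesses, not a documented formula.

```python
def risk_priority(epss: float, in_kev: bool, cvss: float) -> float:
    """Weighted sum under the ASSUMED 0-100 scaling described above."""
    epss_component = epss * 100           # 0.0545 -> 5.45
    kev_component = 100.0 if in_kev else 0.0
    cvss_component = cvss * 10            # 10.0 -> 100
    return 0.6 * epss_component + 0.2 * kev_component + 0.2 * cvss_component


# Inputs taken from this entry: EPSS 0.0545, not in KEV, CVSS 10.0.
priority = risk_priority(0.0545, False, 10.0)
```

Under this reading almost all of the score comes from the CVSS term, since the EPSS probability is small and the CVE is not in KEV.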

Summary

CVE-2025-32444 is a Deserialization of Untrusted Data (CWE-502) vulnerability in vLLM with a CVSS v3.1 base score of 10.0 (Critical).

Operationally, it ranks in the top 9.6% of CVEs by predicted exploit likelihood (EPSS), it is not currently listed in the CISA KEV catalog, and a public proof-of-concept is referenced.

This vulnerability is AI-related: it is categorised as NLP and Transformers, in the Supply Chain and Deployment risk domain.

Deeper analysis

vLLM is a high-throughput inference and serving engine for large language models. CVE-2025-32444 affects deployments of versions 0.6.5 through 0.8.4 that enable the optional mooncake integration. The flaw stems from the use of Python pickle serialization over unauthenticated ZeroMQ sockets that bind to all network interfaces. This allows deserialization of untrusted data (CWE-502) and leads to remote code execution, hence the CVSS score of 10.0.
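To illustrate why unpickling untrusted bytes amounts to code execution, here is a deliberately benign sketch. It is not vLLM code: the `Demo` class is hypothetical, and the callable it names is a harmless addition. The point is that the sender of the bytes, not the receiver, chooses which callable runs during `pickle.loads`.

```python
import operator
import pickle


class Demo:
    """Benign stand-in for an object crafted by whoever sends the bytes."""

    def __reduce__(self):
        # Instead of the object's state, pickle records the instruction
        # "call operator.add(2, 3)" for the receiving side to execute.
        return (operator.add, (2, 3))


payload = pickle.dumps(Demo())

# The receiver never gets a Demo instance back; loading the bytes
# invokes the recorded callable and returns its result.
result = pickle.loads(payload)
```

An attacker would substitute a callable with side effects on the host, which is why pickle data must only ever be accepted from trusted, authenticated peers.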

An attacker who can reach the ZeroMQ endpoints over the network can send a malicious pickle payload and execute arbitrary code on the vLLM host without authentication or user interaction. Only deployments that activate mooncake are exposed; instances that do not use this integration remain unaffected.
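Because exposure depends on which interface a socket binds to, a configuration audit can flag wildcard bind addresses. The sketch below checks ZeroMQ-style endpoint strings such as `tcp://*:5555`; the helper name `binds_all_interfaces` and the sample endpoints are hypothetical, and the function only inspects strings rather than live sockets.

```python
# Host parts that mean "listen on every interface" in a TCP endpoint string.
WILDCARD_HOSTS = {"*", "0.0.0.0", "::", "[::]"}


def binds_all_interfaces(endpoint: str) -> bool:
    """Return True if a ZeroMQ-style TCP endpoint uses a wildcard host."""
    scheme, sep, rest = endpoint.partition("://")
    if not sep or scheme != "tcp":
        # ipc:// and inproc:// transports are not reachable over the network.
        return False
    # Split on the last colon so the port is separated from the host.
    host, _, _port = rest.rpartition(":")
    return host in WILDCARD_HOSTS


exposed = binds_all_interfaces("tcp://*:5555")
loopback_only = binds_all_interfaces("tcp://127.0.0.1:5555")
```

Binding to a loopback or private address narrows who can reach the socket, but it does not make unauthenticated pickle traffic safe on its own.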

The project has released version 0.8.5, which disables the vulnerable code path. Remediation guidance and the fixing commit are documented in the GitHub Security Advisories GHSA-hj4w-hm2g-p6w5 and GHSA-x3m8-f7g5-qhm7, along with the associated pull request that removes the insecure socket configuration.

The EPSS score rose from a low baseline to a recorded peak of 0.0776 (the current score is 0.0545), indicating increased exploitation interest after disclosure and suggesting the issue merits continued monitoring in LLM-serving environments.

EU & UK References

🇪🇺 ENISA EUVD: EUVD-2025-12670

Vulnerability details

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.6.5 and prior to 0.8.5, having vLLM integration with mooncake, are vulnerable to remote code execution due to using pickle based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker is able to reach the vulnerable ZeroMQ sockets to carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.
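The affected range described above (0.6.5 inclusive, 0.8.5 exclusive) can be checked mechanically. This is a minimal sketch with hypothetical helper names; it compares only the first three numeric components, so suffixes such as `.post1` or `rc1` are ignored, and it says nothing about whether the mooncake integration is actually enabled.

```python
import re

AFFECTED_FROM = (0, 6, 5)  # first affected release, inclusive
FIXED_IN = (0, 8, 5)       # first patched release, exclusive bound


def parse_version(text: str) -> tuple:
    """Extract the leading MAJOR.MINOR.PATCH numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version: str) -> bool:
    """Return True if the version falls inside the affected range."""
    return AFFECTED_FROM <= parse_version(version) < FIXED_IN


last_vulnerable = is_affected("0.8.4")
first_fixed = is_affected("0.8.5")
```

In practice the installed version string could be obtained with `importlib.metadata.version("vllm")` from the standard library and passed to `is_affected`.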

CWE(s): CWE-502

AI Security Analysis

AI Category: NLP and Transformers
Risk Domain: Supply Chain and Deployment
OWASP Top 10 for LLMs 2025: None mapped
Classification Reason: Matched keywords: llms, vllm

Affected Assets

vllm

0.6.5 up to, but not including, 0.8.5

Mitigating Controls

Likely Mitigating Controls

Per-CVE control mapping for this CVE has not run yet; the list below is derived from the weakness types (CWEs) cited in the NVD entry.

CA-8 Penetration Testing

addresses: CWE-502

Penetration testing supplies malicious serialized objects, detecting unsafe deserialization and supporting corrective actions.

SA-11 Developer Testing and Evaluation

addresses: CWE-502

Evaluation of untrusted data handling (deserialization testing) reveals unsafe processing, which the required remediation process addresses.

SC-44 Detonation Chambers

addresses: CWE-502

Untrusted serialized data can be deserialized and observed inside the chamber, confining any gadget-chain exploitation to the sandbox.

SI-10 Information Input Validation

addresses: CWE-502

Validates or rejects untrusted serialized data before deserialization occurs.
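One way to apply this kind of validation to pickle data is the restricted-unpickler pattern from the Python documentation, which overrides `Unpickler.find_class` to reject any global not on an allow-list. The sketch below is illustrative only and is not how vLLM addressed the issue; the names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are assumptions made for the example.

```python
import io
import operator
import pickle

# Empty allow-list: plain containers, strings, and numbers still load
# because they are encoded with opcodes rather than global lookups.
ALLOWED_GLOBALS = frozenset()


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the stream asks for a global (class or function).
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain data passes through unchanged.
safe = restricted_loads(pickle.dumps({"tokens": [1, 2, 3]}))

# A stream that references a callable is rejected before anything runs.
try:
    restricted_loads(pickle.dumps(operator.add))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

This narrows the attack surface but is easy to get wrong as the allow-list grows; a data-only format such as JSON avoids the problem more directly.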

SI-3 Malicious Code Protection

addresses: CWE-502

Identifies and blocks malicious code introduced through deserialization of untrusted data at system boundaries.

SI-7 Software, Firmware, and Information Integrity

addresses: CWE-502

Integrity verification of serialized information can detect tampering before deserialization occurs.
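As a sketch of integrity verification before deserialization, the example below prepends an HMAC-SHA256 tag to a JSON body and refuses to parse frames whose tag does not match. The frame layout, the function names, and the use of JSON instead of pickle are assumptions for illustration; key distribution is out of scope here.

```python
import hashlib
import hmac
import json

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign(message: dict, key: bytes) -> bytes:
    """Serialize to JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(message, sort_keys=True).encode("utf-8")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return tag + body


def verify_and_load(frame: bytes, key: bytes) -> dict:
    """Check the tag first; only parse the body if it is authentic."""
    tag, body = frame[:TAG_SIZE], frame[TAG_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return json.loads(body)


key = b"example-shared-secret"  # placeholder; use a randomly generated key
frame = sign({"request_id": 7}, key)
message = verify_and_load(frame, key)
```

A frame altered in transit, or produced without the key, fails the comparison and is never handed to the parser.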

SR-4 Provenance

addresses: CWE-502

Provenance of associated data allows detection of untrusted sources before deserialization or processing occurs.

References

https://github.com/vllm-project/vllm/blob/32b14baf8a1f7195ca09484de3008063569b43c5/vllm/distributed/kv_transfer/kv_pipe/mooncake_pipe.py#L179
Product · security-advisories@github.com
https://github.com/vllm-project/vllm/commit/a5450f11c95847cf51a17207af9a3ca5ab569b2c
Patch · security-advisories@github.com
https://github.com/vllm-project/vllm/security/advisories/GHSA-hj4w-hm2g-p6w5
Exploit, Vendor Advisory · security-advisories@github.com
https://github.com/vllm-project/vllm/security/advisories/GHSA-x3m8-f7g5-qhm7
Not Applicable · security-advisories@github.com