Cyber Posture

CVE-2025-24357

High · RCE

Published: 27 January 2025
Modified: 27 June 2025
KEV Added: not listed
Patch: fixed in vLLM 0.7.0
CVSS Score: 7.5 (CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H)
EPSS Score: 0.0101 (77.2th percentile)
Risk Priority: 16 (weights: 60% EPSS · 20% KEV · 20% CVSS)
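The page does not say how the three weighted inputs are normalised. One reading that happens to reproduce the displayed value of 16 is to scale each input to 0-100 before weighting. This is an assumption, not the site's documented formula, and the function name risk_priority is hypothetical.

```python
def risk_priority(epss_probability, in_kev, cvss_base):
    """Weighted blend of EPSS, KEV and CVSS on an assumed 0-100 scale."""
    # Assumed normalisation (not stated on the page):
    epss_component = epss_probability * 100      # probability 0-1  -> 0-100
    kev_component = 100.0 if in_kev else 0.0     # listed in KEV    -> 100, else 0
    cvss_component = cvss_base * 10              # base score 0-10  -> 0-100
    # Weights as shown on the page: 60% EPSS, 20% KEV, 20% CVSS.
    return 0.6 * epss_component + 0.2 * kev_component + 0.2 * cvss_component


if __name__ == "__main__":
    # Inputs taken from this page: EPSS 0.0101, not in KEV, CVSS 7.5.
    print(round(risk_priority(0.0101, False, 7.5)))
```

Under this reading almost all of the score comes from the CVSS term, since the EPSS probability is low and the KEV term is zero.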

Summary

CVE-2025-24357 is a high-severity Deserialization of Untrusted Data (CWE-502) vulnerability in vLLM, with a CVSS v3.1 base score of 7.5 (High).

Operationally, exploitation aligns with the MITRE ATT&CK technique Exploitation for Client Execution (T1203). Its EPSS score ranks it in the top 22.8% of CVEs by exploit likelihood, and it is not currently listed in the CISA KEV catalog.

This vulnerability is AI-related: it is categorised as NLP and Transformers, in the Supply Chain and Deployment risk domain.

The strongest mitigations our analysis identified are NIST 800-53 CM-6 (Configuration Settings) and SI-2 (Flaw Remediation).

Threat & Defense at a Glance

What attackers do: exploitation maps to Exploitation for Client Execution (T1203) and one other technique, Compromise Software Supply Chain (T1195.002). What defenders deploy: see the NIST 800-53 controls recommended below.
Threat & Defense Details

Mitigating Controls (NIST 800-53 r5) [AI]

Prevent: Requires timely identification, reporting, and remediation of the deserialization flaw by patching vLLM to version 0.7.0 or later, where torch.load uses weights_only=True by default.

Prevent: Mandates integrity checks on model checkpoints prior to loading and execution, preventing arbitrary code execution from malicious pickle data downloaded from Hugging Face.

Prevent: Enforces secure baseline configuration settings for libraries like PyTorch's torch.load, such as enabling weights_only=True, to block unsafe deserialization of untrusted model files.
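One way to check the configuration baseline described above is a static scan that flags torch.load calls not passing weights_only=True explicitly. The following is a minimal sketch using only the standard-library ast module. The function name find_unsafe_torch_load is hypothetical, and the scan only recognises calls written literally as torch.load(...), so aliased imports or wrappers would be missed.

```python
import ast


def find_unsafe_torch_load(source, filename="<string>"):
    """Return sorted line numbers of torch.load(...) calls lacking weights_only=True."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal attribute access `torch.load`.
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        # Accept the call only when weights_only is the constant True.
        explicit_true = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not explicit_true:
            findings.append(node.lineno)
    return sorted(findings)


if __name__ == "__main__":
    sample = (
        "import torch\n"
        "a = torch.load('model.pt')\n"
        "b = torch.load('model.pt', weights_only=True)\n"
    )
    print(find_unsafe_torch_load(sample))
```

Requiring the explicit keyword, rather than relying on the installed library's default, keeps the check independent of which PyTorch or vLLM version is deployed.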

MITRE ATT&CK Enterprise Techniques [AI]

T1203 Exploitation for Client Execution (tactic: Execution)
Adversaries may exploit software vulnerabilities in client applications to execute code.
T1195.002 Compromise Software Supply Chain (tactic: Initial Access)
Adversaries may manipulate application software prior to receipt by a final consumer for the purpose of data or system compromise.
Why these techniques?

Because torch.load unpickles checkpoints with weights_only=False, a malicious Hugging Face model checkpoint can execute arbitrary code when it is loaded. This supports both exploitation for client execution and supply chain compromise through tainted software dependencies.

NVD Description

vLLM is a library for LLM inference and serving. vllm/model_executor/weight_utils.py implements hf_model_weights_iterator to load the model checkpoint, which is downloaded from huggingface. It uses the torch.load function and the weights_only parameter defaults to False. When torch.load loads malicious pickle data, it will execute arbitrary code during unpickling. This vulnerability is fixed in v0.7.0.

Deeper analysis [AI]

CVE-2025-24357 is a deserialization vulnerability (CWE-502) in the vLLM library, which provides efficient inference and serving for large language models. The issue lies in the hf_model_weights_iterator function within vllm/model_executor/weight_utils.py. This function downloads model checkpoints from Hugging Face and loads them using PyTorch's torch.load with the weights_only parameter defaulting to False, enabling arbitrary code execution during unpickling of malicious pickle data.
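The unpickling mechanism described above can be illustrated with the standard-library pickle module alone, without PyTorch. In this sketch the class MaliciousCheckpoint and the helper restricted_loads are hypothetical names, and the payload is a harmless eval of an arithmetic expression. The allowlisting unpickler follows the "restricting globals" pattern from the Python documentation and is similar in spirit to, but not the same as, what weights_only=True does in PyTorch.

```python
import io
import pickle


class MaliciousCheckpoint:
    """Hypothetical stand-in for an object embedded in a tainted checkpoint."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it at load time.
        # A harmless eval is used here; an attacker would pick os.system or similar.
        return (eval, ("6 * 7",))


class AllowlistUnpickler(pickle.Unpickler):
    """Resolve only explicitly allowed globals and reject everything else."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Unpickle bytes while refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


blob = pickle.dumps(MaliciousCheckpoint())

if __name__ == "__main__":
    # Plain pickle.loads calls the attacker-chosen callable during loading.
    print(pickle.loads(blob))
    try:
        restricted_loads(blob)
    except pickle.UnpicklingError as exc:
        print("rejected:", exc)
```

The unsafe load never reconstructs a MaliciousCheckpoint instance: it returns whatever the embedded callable returns, which is why loading alone is enough to run attacker code.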

Attackers can exploit this vulnerability by publishing a malicious model checkpoint to Hugging Face. Any user running an affected version of vLLM who loads that checkpoint triggers arbitrary code execution on their own system. The CVSS v3.1 base score of 7.5 (AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H) reflects a network attack vector, high attack complexity, no privileges required, user interaction (loading the model), unchanged scope, and high impact on confidentiality, integrity, and availability.
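The 7.5 base score can be reproduced from the vector using the CVSS v3.1 base-score equations. The sketch below covers only Scope: Unchanged vectors, which is all this CVE needs. The metric weights and the Roundup rule are taken from the CVSS v3.1 specification, while the function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (PR values are for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H"))
```

For this vector the impact sub-score is about 5.87 and exploitability about 1.62, so the high attack complexity and required user interaction are what keep the score below the Critical range.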

The vulnerability was fixed in vLLM version 0.7.0. Mitigation involves upgrading to v0.7.0 or later. Relevant resources include the GitHub security advisory at GHSA-rh4j-5rhw-hr54, the fixing pull request #12366, and commit d3d6bb13fb62da3234addf6574922a4ec0513d04; PyTorch documentation for torch.load also notes risks associated with weights_only=False.
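A deployment can check whether its installed vLLM is at or beyond the fixed release. The following is a rough sketch using importlib.metadata from the standard library. The helper names are hypothetical, and the parser deliberately looks only at the leading numeric release segment, so a pre-release such as 0.7.0rc1 would be treated as 0.7.0. A stricter check would use a full version parser.

```python
import re
from importlib import metadata

FIXED_RELEASE = (0, 7, 0)  # first fixed version according to the advisory


def release_tuple(version):
    """Parse the leading numeric release segment, padded to three parts."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    numbers = tuple(int(part) for part in match.group(0).split("."))
    return numbers + (0,) * (3 - len(numbers))


def is_patched(version):
    """True when the release segment is at least 0.7.0."""
    return release_tuple(version) >= FIXED_RELEASE


def installed_vllm_is_patched():
    """Return True/False for the installed vllm, or None if it is not installed."""
    try:
        return is_patched(metadata.version("vllm"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_vllm_is_patched())
```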

This vulnerability highlights risks in AI/ML pipelines, as vLLM is widely used for LLM serving, emphasizing the need to validate model sources and enforce secure deserialization practices.
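Validating a model source can be as simple as pinning a known-good digest for each checkpoint file and refusing to load anything that does not match. This is a minimal sketch with the standard-library hashlib module. The function names are hypothetical, and how the expected digest is obtained and stored is left as an assumption outside the sketch.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_sha256):
    """Return True only if the file's digest matches the pinned value."""
    actual = sha256_of(path)
    return hmac.compare_digest(actual, expected_sha256.lower())


def require_verified(path, expected_sha256):
    """Raise before any deserialization happens if the digest does not match."""
    if not verify_checkpoint(path, expected_sha256):
        raise ValueError(f"checkpoint digest mismatch: {path}")
    return path
```

A digest check only proves the file is the one that was pinned. It does not make a pickle-based checkpoint safe to load if the pinned file was malicious in the first place, so it complements restricted deserialization rather than replacing it.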

Details

CWE(s)
CWE-502 (Deserialization of Untrusted Data)

Affected Products
vllm vllm, versions before 0.7.0 (fixed in 0.7.0)

AI Security Analysis [AI]

AI Category
NLP and Transformers
Risk Domain
Supply Chain and Deployment
OWASP Top 10 for LLMs 2025
None mapped
Classification Reason
vLLM is a library specifically for LLM inference and serving, where LLMs are based on transformer architectures, making 'NLP and Transformers' the most fitting category.

CVEs Like This One

CVE-2024-11041 (same product: vLLM)
CVE-2025-29783 (same product: vLLM)
CVE-2026-27893 (same product: vLLM)
CVE-2025-66448 (same product: vLLM)
CVE-2025-62164 (same product: vLLM)
CVE-2026-22807 (same product: vLLM)
CVE-2026-22773 (same product: vLLM)
CVE-2026-22778 (same product: vLLM)
CVE-2026-25960 (same product: vLLM)
CVE-2026-24779 (same product: vLLM)

References