CVE-2025-46560

Medium · Public PoC

Published: 30 April 2025

Modified: 28 May 2025

KEV Added: not listed

Patch: not recorded

CVSS Score (v3.1): 6.5 (CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H)
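The 6.5 base score can be reproduced from the vector string. The sketch below applies the CVSS v3.1 base-score equations for the Scope: Unchanged case only; the function names are illustrative, and the weights are the published v3.1 metric values.

```python
import math

# Metric weights from the CVSS v3.1 specification, for Scope: Unchanged only.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """CVSS v3.1 Roundup: smallest number with one decimal that is >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector):
    """Compute the base score for a CVSS:3.1 vector with S:U."""
    parts = dict(item.split(":") for item in vector.split("/"))
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged("CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H")
```

With these inputs the impact term is about 3.60 and the exploitability term about 2.84, so the sum of roughly 6.43 rounds up to 6.5. The whole impact contribution comes from A:H, since confidentiality and integrity are both rated None.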

EPSS Score: 0.0061 (percentile 70.3)

Risk Priority: 13 (weights: 60% EPSS · 20% KEV · 20% CVSS)
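This page does not state how the three weights are combined. As an assumption only, one reading that happens to reproduce the displayed value puts each input on a 0 to 100 scale (EPSS probability times 100, KEV as 0 or 100, CVSS times 10) and takes the weighted sum. The function below is a hypothetical illustration of that reading, not the site's documented formula.

```python
def risk_priority(epss, in_kev, cvss):
    """Hypothetical weighted score; the 0-100 scaling is an assumption."""
    epss_points = epss * 100              # EPSS probability as a percentage
    kev_points = 100.0 if in_kev else 0.0  # listed in CISA KEV or not
    cvss_points = cvss * 10               # CVSS base score scaled to 0-100
    return 0.6 * epss_points + 0.2 * kev_points + 0.2 * cvss_points


priority = risk_priority(epss=0.0061, in_kev=False, cvss=6.5)
```

Under this assumption nearly all of the score comes from the CVSS term (0.2 × 65 = 13); the EPSS term adds about 0.37 and the KEV term adds nothing.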

Summary

CVE-2025-46560 is a medium-severity vulnerability in vLLM, classified as Inefficient Regular Expression Complexity (CWE-1333). Its CVSS v3.1 base score is 6.5.

Operationally, it ranks in the top 29.7% of CVEs by exploit likelihood (EPSS), is not currently listed in the CISA KEV catalog, and has a public proof-of-concept referenced.

This vulnerability is AI-related: it is categorised as NLP and Transformers, in the Adversarial Attacks risk domain.

Deeper analysis

vLLM is a high-throughput inference and serving engine for large language models. Versions 0.8.0 through 0.8.4 contain a performance vulnerability in the multimodal tokenizer's input preprocessing logic. The affected code replaces placeholder tokens such as <|audio_|> and <|image_|> by repeatedly concatenating tokens according to precomputed lengths; the use of inefficient list operations produces quadratic time complexity, enabling an attacker to supply specially crafted inputs that trigger excessive CPU and memory consumption.
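The quadratic behaviour described above can be illustrated with a minimal sketch. This is not the vLLM code: the function name, the integer token IDs, and the `copied` counter are hypothetical and exist only to make the cost visible. The pattern shown is rebuilding the output list with `+` on every step, which copies everything accumulated so far.

```python
def expand_quadratic(token_ids, placeholder_id, repeat_lengths):
    """Expand placeholders by rebuilding the output list on every step.

    Returns the expanded list and the number of elements copied, which
    serves as a deterministic proxy for the work done.
    """
    output = []
    copied = 0
    lengths = iter(repeat_lengths)
    for tok in token_ids:
        if tok == placeholder_id:
            chunk = [placeholder_id] * next(lengths)
        else:
            chunk = [tok]
        # `output + chunk` allocates a new list and copies both operands.
        copied += len(output) + len(chunk)
        output = output + chunk
    return output, copied


expanded, copied = expand_quadratic([1, 99, 2, 99], 99, [3, 2])
_, copied_1000 = expand_quadratic(list(range(1000)), -1, [])
```

For n plain tokens the counter reaches n(n+1)/2, so doubling the input length roughly quadruples the work. That growth is what lets a long crafted prompt consume disproportionate CPU time.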

An authenticated remote attacker with low privileges can submit malicious multimodal prompts to the inference engine. Because the vulnerability affects only availability, successful exploitation results in resource exhaustion that can degrade or deny service to other users without disclosing data or altering model behavior.

The issue is resolved in vLLM 0.8.5. The project security advisory GHSA-vc6m-hm49-g9qg and the corresponding code change in phi4mm.py document the patch that replaces the quadratic concatenation pattern with an efficient implementation.
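As a hedged illustration of what an efficient replacement can look like, the sketch below grows a single list in place with `append` and `extend`, so each output element is written once. It is not the actual patch from the advisory; the function name and the integer token IDs are hypothetical.

```python
def expand_linear(token_ids, placeholder_id, repeat_lengths):
    """Expand placeholders in one pass, mutating a single output list."""
    output = []
    lengths = iter(repeat_lengths)
    for tok in token_ids:
        if tok == placeholder_id:
            # extend() appends in place; earlier elements are not re-copied.
            output.extend([placeholder_id] * next(lengths))
        else:
            output.append(tok)
    return output


expanded_linear = expand_linear([1, 99, 2, 99], 99, [3, 2])
```

Total work here is proportional to the length of the expanded output rather than to its square.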

The vulnerability is specific to multimodal LLM workloads and carries a CVSS score of 6.5 with a high availability impact. EPSS remains low, with a recorded peak of 0.0152.

EU & UK References

🇪🇺 ENISA EUVD: EUVD-2025-12671

Vulnerability details

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions starting from 0.8.0 and prior to 0.8.5 are affected by a critical performance vulnerability in the input preprocessing logic of the multimodal tokenizer. The code dynamically replaces placeholder tokens (e.g., <|audio_|>, <|image_|>) with repeated tokens based on precomputed lengths. Due to inefficient list concatenation operations, the algorithm exhibits quadratic time complexity (O(n²)), allowing malicious actors to trigger resource exhaustion via specially crafted inputs. This issue has been patched in version 0.8.5.

CWE(s): CWE-1333

AI Security Analysis

AI Category: NLP and Transformers
Risk Domain: Adversarial Attacks
OWASP Top 10 for LLMs 2025: None mapped
Classification Reason: Matched keywords: llms, vllm

Affected Assets

vllm

0.8.0 up to, but not including, 0.8.5 (fixed release)
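The affected range on this page (from 0.8.0 up to, but excluding, the fixed release 0.8.5) can be checked against an installed version with a small helper. This is a sketch under the assumption that the version string is a plain `X.Y.Z` release; pre-release or local suffixes such as `rc1` or `+cu121` are not handled, and the function names are hypothetical.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version):
    """True if the version lies in the range 0.8.0 <= v < 0.8.5."""
    return (0, 8, 0) <= parse_version(version) < (0, 8, 5)


results = {v: is_affected(v) for v in ["0.7.3", "0.8.0", "0.8.4", "0.8.5"]}
```

Tuple comparison is used instead of string comparison so that a version such as 0.10.0 sorts after 0.8.5.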

Mitigating Controls

No mitigating controls mapped yet. The per-CVE control annotator has not reached this CVE.

References

https://github.com/vllm-project/vllm/blob/8cac35ba435906fb7eb07e44fe1a8c26e8744f4e/vllm/model_executor/models/phi4mm.py#L1182-L1197
Product · security-advisories@github.com
https://github.com/vllm-project/vllm/security/advisories/GHSA-vc6m-hm49-g9qg
Exploit, Vendor Advisory · security-advisories@github.com, 134c704f-9b21-4f2e-91b3-4a467353bcc0