CVE-2025-48956
Published: 21 August 2025
Summary
CVE-2025-48956 is a high-severity Uncontrolled Resource Consumption (CWE-400) vulnerability in Vllm Vllm. Its CVSS base score is 7.5 (High).
Operationally, ranked in the top 45.8% of CVEs by exploit likelihood; it is not currently listed in the CISA KEV catalog.
This vulnerability is AI-related — categorised as NLP and Transformers; in the Other ATLAS/OWASP Terms risk domain.
EU & UK References
- 🇪🇺 ENISA EUVD: EUVD-2025-25446
Vulnerability details
vLLM is an inference and serving engine for large language models (LLMs). From 0.1.0 to before 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an…
more
HTTP endpoint. This results in server memory exhaustion, potentially leading to a crash or unresponsiveness. The attack does not require authentication, making it exploitable by any remote user. This vulnerability is fixed in 0.10.1.1.
- CWE(s)
AI Security AnalysisAI
- AI Category
- NLP and Transformers
- Risk Domain
- Other ATLAS/OWASP Terms
- OWASP Top 10 for LLMs 2025
- None mapped
- Classification Reason
- Matched keywords: llms, vllm
Related Threats
Affected Assets
Mitigating Controls
Likely Mitigating Controls AI
Per-CVE control mapping for this CVE has not run yet; the list below is derived from the weakness types (CWEs) cited in the NVD entry.
Limiting concurrent sessions directly prevents uncontrolled resource consumption by capping the number of active sessions per user or account.
Analysis identifies uncontrolled resource consumption indicative of denial-of-service or abuse attempts.
Contingency plan testing includes resource exhaustion scenarios to verify recovery, making it harder for attackers to sustain exploits that cause uncontrolled consumption.
Updated contingency plans include current procedures to detect, contain, and recover from resource exhaustion, limiting an attacker's ability to sustain impact from uncontrolled consumption.
Alternate site allows resumption of operations if resource exhaustion at the primary site is exploited to cause unavailability.
Alternate telecommunications services enable resumption of essential functions when primary services become unavailable due to uncontrolled resource consumption.
The team can analyze and respond to resource exhaustion incidents, reducing the impact of attacks that exploit uncontrolled consumption weaknesses.
Timely maintenance support and spare parts enable rapid recovery from failures induced by uncontrolled resource consumption, shortening the impact window of denial-of-service attacks.