Cyber Resilience

CVE-2025-32381

MediumDDoS

Published: 09 April 2025

Published
09 April 2025
Modified
17 September 2025
KEV Added
Patch
CVSS Score v3.1 6.5 CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
EPSS Score 0.0035 58.1th percentile
Risk Priority 13 60% EPSS · 20% KEV · 20% CVSS

Summary

CVE-2025-32381 is a medium-severity Allocation of Resources Without Limits or Throttling (CWE-770) vulnerability in Mlc-Ai Xgrammar. Its CVSS base score is 6.5 (Medium).

Operationally, exploitation aligns with the MITRE ATT&CK technique Application Exhaustion Flood (T1499.003); ranked in the top 41.9% of CVEs by exploit likelihood; it is not currently listed in the CISA KEV catalog.

This vulnerability is AI-related — categorised as NLP and Transformers; in the Supply Chain and Deployment risk domain.

EU & UK References

Vulnerability details

XGrammar is an open-source library for efficient, flexible, and portable structured generation. Prior to 0.1.18, Xgrammar includes a cache for compiled grammars to increase performance with repeated use of the same grammar. This cache is held in memory. Since the…

more

cache is unbounded, a system making use of xgrammar can be abused to fill up a host's memory and case a denial of service. For example, sending many small requests to an LLM inference server with unique JSON schemas would eventually cause this denial of service to occur. This vulnerability is fixed in 0.1.18.

CWE(s)

AI Security AnalysisAI

AI Category
NLP and Transformers
Risk Domain
Supply Chain and Deployment
OWASP Top 10 for LLMs 2025
None mapped
Classification Reason
Matched keywords: llm

Related Threats

MITRE ATT&CK Enterprise TechniquesAI

T1499.003 Application Exhaustion Flood Impact
Adversaries may target resource intensive features of applications to cause a denial of service (DoS), denying availability to those applications.
T1499.004 Application or System Exploitation Impact
Adversaries may exploit software vulnerabilities that can cause an application or system to crash and deny availability to users.
Why these techniques?

Unbounded in-memory cache for compiled grammars enables remote memory exhaustion via repeated unique requests to applications like LLM inference servers, facilitating endpoint DoS through application exhaustion flood and exploitation.

Affected Assets

mlc-ai
xgrammar
≤ 0.1.18

Mitigating Controls

Likely Mitigating Controls AI

Per-CVE control mapping for this CVE has not run yet; the list below is derived from the weakness types (CWEs) cited in the NVD entry.

addresses: CWE-770

This control implements explicit throttling on session allocation, addressing the weakness of allocating resources without limits.

addresses: CWE-770

Plan testing exercises resource allocation limits and throttling during simulated failures, directly addressing weaknesses that allow unbounded resource use.

addresses: CWE-770

Contingency plan updates ensure recovery strategies address unbounded resource allocation, making it harder for attackers to exploit lack of throttling to cause prolonged outages.

addresses: CWE-770

Provides continuity when unbounded resource allocation at the primary site leads to exhaustion and downtime.

addresses: CWE-770

Alternate services allow operations to continue when primary allocation of resources lacks limits or throttling.

addresses: CWE-770

Explicit planning of security-related actions requires defining limits, windows, and resource allocations, making allocation without throttling far less likely.

addresses: CWE-770

Measures of performance include tracking allocation behavior and throttling effectiveness, reducing the window for resource exhaustion attacks.

addresses: CWE-770

Imposes an inactivity-based limit on network resource allocation, throttling the number of concurrently held connections.

References