CVE-2025-23311
Published: 06 August 2025
Summary
CVE-2025-23311 is a stack-based buffer overflow (CWE-121) in NVIDIA Triton Inference Server with a CVSS base score of 9.8 (Critical).
Operationally, exploitation aligns with the MITRE ATT&CK technique Exploit Public-Facing Application (T1190). The CVE ranks in the top 17.5% of CVEs by exploit likelihood and is not currently listed in the CISA KEV catalog.
The strongest mitigations our analysis identified are NIST 800-53 SI-2 (Flaw Remediation) and SI-10 (Information Input Validation).
Deeper analysis
NVIDIA Triton Inference Server is affected by CVE-2025-23311, a stack-based buffer overflow vulnerability (CWE-121) that can be triggered by specially crafted HTTP requests. The flaw carries a CVSS 3.1 score of 9.8 and may result in remote code execution, denial of service, information disclosure, or data tampering.
An unauthenticated attacker with network access can exploit the issue by sending malicious HTTP requests, potentially compromising the inference server without any user interaction or credentials.
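To see how these characteristics produce a score of 9.8, the CVSS 3.1 base-score formula can be worked through in a few lines. This is a minimal sketch that assumes the vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H. The network attack vector, the absence of required privileges, and the absence of user interaction follow from the description above; low attack complexity, unchanged scope, and high impact on confidentiality, integrity, and availability are assumptions consistent with a 9.8 score rather than values quoted from the advisory. The metric weights and the rounding rule are those of the CVSS 3.1 specification; the function names are illustrative.

```python
import math

# CVSS 3.1 metric weights for the assumed vector
# AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H (the vector is an assumption).
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
CIA_HIGH = 0.56


def roundup(value: float) -> float:
    """Round up to one decimal place as defined in the CVSS 3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Compute the CVSS 3.1 base score for a Scope: Unchanged vector."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)       # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    AV_NETWORK, AC_LOW, PR_NONE, UI_NONE, CIA_HIGH, CIA_HIGH, CIA_HIGH
)
print(score)  # 9.8
```

Under these assumptions the impact term is about 5.87 and the exploitability term about 3.89; their sum, 9.76, rounds up to 9.8.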
The referenced NVIDIA security advisory at nvidia.custhelp.com provides official guidance on the vulnerability. The EPSS score remains low and flat at 0.0167, an estimated probability of about 1.67% of exploitation activity in the next 30 days, with no observed increase after disclosure.
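The two EPSS-related figures in this entry, the 0.0167 score and the "top 17.5%" rank from the summary, can be related with simple arithmetic. The sketch below assumes that "top 17.5% of CVEs" corresponds to an EPSS percentile of about 0.825; that percentile is derived here for illustration and is not quoted from FIRST. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpssReading:
    """One EPSS data point; the class and field names are illustrative."""
    score: float       # estimated probability of exploitation activity in the next 30 days (0..1)
    percentile: float  # share of scored CVEs with an equal or lower score (0..1)


def describe(reading: EpssReading) -> str:
    """Render an EPSS reading as a probability plus a 'top X%' rank."""
    probability_pct = reading.score * 100
    top_pct = (1 - reading.percentile) * 100
    return f"{probability_pct:.2f}% probability, top {top_pct:.1f}% of CVEs"


# 0.0167 is the score reported in this entry; 0.825 is the assumed percentile.
reading = EpssReading(score=0.0167, percentile=0.825)
print(describe(reading))  # 1.67% probability, top 17.5% of CVEs
```

A low absolute score and a relatively high rank are compatible because most CVEs score very close to zero.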
EU & UK References
- 🇪🇺 ENISA EUVD: EUVD-2025-23840
Vulnerability details
NVIDIA Triton Inference Server contains a vulnerability where an attacker could cause a stack overflow through specially crafted HTTP requests. A successful exploit of this vulnerability might lead to remote code execution, denial of service, information disclosure, or data tampering.
- CWE(s): CWE-121 (Stack-based Buffer Overflow)
Related Threats
MITRE ATT&CK Enterprise Techniques
Why these techniques?
A stack-based buffer overflow in a public-facing Triton Inference Server, triggered by crafted HTTP requests, can enable remote code execution, which maps directly to T1190 (Exploit Public-Facing Application).
Mitigating Controls
Mitigating Controls (NIST 800-53 r5)
- SI-2 (Flaw Remediation): Requires identifying, reporting, and promptly remediating the stack-based buffer overflow in NVIDIA Triton Inference Server by patching.
- SI-10 (Information Input Validation): Requires validating HTTP request inputs so that specially crafted requests that trigger the overflow are blocked.
- SI-16 (Memory Protection): Calls for memory protection mechanisms such as stack canaries, ASLR, and non-executable stacks, which make it harder to turn the overflow into remote code execution or data tampering.
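Whether one of the memory protections mentioned above is active on a host can be checked directly. The sketch below is a minimal example that assumes the server runs on a Linux host; it reads the kernel setting at /proc/sys/kernel/randomize_va_space to report the ASLR mode. It does not inspect the Triton binary for stack canaries or a non-executable stack, and the function names are illustrative.

```python
from pathlib import Path

# Linux exposes the system-wide ASLR mode in this file (assumes a Linux host).
ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")

ASLR_MODES = {
    0: "disabled",
    1: "partial (stack, mmap base, and VDSO randomized)",
    2: "full (also randomizes the heap)",
}


def describe_aslr(raw_value: str) -> str:
    """Translate the raw text of the sysctl into a human-readable ASLR mode."""
    try:
        mode = int(raw_value.strip())
    except ValueError:
        return "unknown"
    return ASLR_MODES.get(mode, "unknown")


def read_aslr_mode(path: Path = ASLR_SYSCTL) -> str:
    """Read the ASLR setting; return 'unavailable' if the file cannot be read."""
    try:
        return describe_aslr(path.read_text())
    except OSError:
        return "unavailable"


if __name__ == "__main__":
    print(f"ASLR: {read_aslr_mode()}")
```

Any result other than "full" on a host that exposes the inference server is worth investigating, although ASLR only raises the cost of exploitation and does not replace patching.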