CVE-2026-27940
Published: 12 March 2026
Summary
CVE-2026-27940 is a high-severity Heap-based Buffer Overflow (CWE-122) vulnerability. Its CVSS base score is 7.8 (High).
Operationally, exploitation aligns with the MITRE ATT&CK technique Malicious File (T1204.002). The vulnerability ranks at the 5.6th percentile by exploit likelihood, below the median, and is not currently listed in the CISA KEV catalog.
This vulnerability is AI-related and is categorised under NLP and Transformers.
The strongest mitigations our analysis identified are NIST 800-53 SI-16 (Memory Protection) and SI-2 (Flaw Remediation).
Threat & Defense Details
Mitigating Controls (NIST 800-53 r5)
SI-2 (Flaw Remediation): Requires timely identification, reporting, and patching of flaws such as the integer overflow in gguf_init_from_file_impl(), which directly prevents exploitation of CVE-2026-27940.
SI-16 (Memory Protection): Implements memory protections such as non-executable heap regions, which hinder arbitrary code execution from heap buffer overflows triggered by malicious GGUF files.
SI-10 (Information Input Validation): Mandates validation of information inputs such as GGUF files, so that malformed data that could trigger the undersized allocation and subsequent fread() overflow is detected and rejected.
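As a rough illustration of the input-validation control, the sketch below pre-screens the fixed-size header of a GGUF file before it is handed to a loader. It assumes the GGUF v2/v3 header layout (4-byte magic, uint32 version, then uint64 tensor and metadata counts, little-endian). The function name prescreen_gguf_header and the plausibility rule are hypothetical and are not the upstream fix. The advisory text here does not say which field triggers the overflow, so a check like this is defence in depth only and does not replace patching.

```python
import struct
from typing import Tuple

GGUF_MAGIC = b"GGUF"
# Assumed GGUF v2/v3 header: magic, version, tensor_count, metadata_kv_count.
HEADER_FMT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 24 bytes, no padding with "<"


def prescreen_gguf_header(data: bytes, file_size: int) -> Tuple[bool, str]:
    """Hypothetical pre-screen: return (ok, reason) for a GGUF header."""
    if len(data) < HEADER_SIZE:
        return False, "file shorter than GGUF header"
    magic, version, n_tensors, n_kv = struct.unpack_from(HEADER_FMT, data)
    if magic != GGUF_MAGIC:
        return False, "bad magic"
    if version not in (2, 3):
        return False, "unsupported version for this sketch"
    # Every tensor-info or key/value entry occupies at least one byte,
    # so a count larger than the file itself cannot be genuine.
    if n_tensors > file_size or n_kv > file_size:
        return False, "element count exceeds file size"
    return True, "ok"


# Usage with synthetic headers (no real model file involved).
good = struct.pack(HEADER_FMT, GGUF_MAGIC, 3, 10, 20) + b"\x00" * 1000
bad = struct.pack(HEADER_FMT, GGUF_MAGIC, 3, 2**61 + 4, 20) + b"\x00" * 1000
good_result = prescreen_gguf_header(good, len(good))
bad_result = prescreen_gguf_header(bad, len(bad))
```

Rejecting counts that exceed the file size is a cheap sanity bound. It catches absurd values early, but a real parser still needs overflow-checked size arithmetic at every allocation.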
MITRE ATT&CK Enterprise Techniques
Why these techniques?
The vulnerability is triggered when a user loads a malicious GGUF model file into llama.cpp, enabling code execution via the heap overflow; this maps to user execution of a malicious file (T1204.002).
NVD Description
llama.cpp is an inference of several LLM models in C/C++. Prior to b8146, the gguf_init_from_file_impl() in gguf.cpp is vulnerable to an Integer overflow, leading to an undersized heap allocation. Using the subsequent fread() writes 528+ bytes of attacker-controlled data past the buffer boundary. This is a bypass of a similar bug in the same file - CVE-2025-53630, but the fix overlooked some areas. This vulnerability is fixed in b8146.
Deeper analysis
CVE-2026-27940 is an integer overflow in the gguf_init_from_file_impl() function in gguf.cpp of llama.cpp, a C/C++ inference engine for large language models (LLMs). In versions prior to b8146, the overflow leads to an undersized heap allocation, and a subsequent fread() call then writes 528 or more bytes of attacker-controlled data past the buffer boundary, producing a heap buffer overflow. The issue is classified under CWE-122 (Heap-based Buffer Overflow) and CWE-190 (Integer Overflow or Wraparound), with a CVSS v3.1 base score of 7.8.
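The general mechanism, a size calculation that wraps around and yields a too-small allocation, can be sketched as follows. Python integers do not overflow, so the wraparound of an unchecked size_t multiplication is emulated with a 64-bit mask. The element count, element size, and function names are hypothetical values chosen for illustration; the description above does not state which expression in gguf_init_from_file_impl() overflows.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t


def wrapped_mul(count: int, elem_size: int) -> int:
    """Emulate an unchecked size_t multiplication, which wraps modulo 2**64."""
    return (count * elem_size) & SIZE_MAX


def checked_mul(count: int, elem_size: int) -> int:
    """Overflow-checked variant: refuse sizes that do not fit in size_t."""
    if elem_size != 0 and count > SIZE_MAX // elem_size:
        raise OverflowError("count * elem_size does not fit in size_t")
    return count * elem_size


# Hypothetical attacker-chosen count read from a file header.
count = 2**61 + 4
elem_size = 8

alloc_size = wrapped_mul(count, elem_size)  # 32: (2**64 + 32) mod 2**64

try:
    checked_mul(count, elem_size)
    rejected = False
except OverflowError:
    rejected = True  # the checked variant refuses the request
```

With these values the unchecked product wraps to 32 bytes, so a buffer of that size would be allocated while later reads still trust the original count. The checked variant rejects the request before any allocation takes place.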
Exploitation is local and requires no privileges, but it does require user interaction: the attacker must get a user to load a malicious GGUF file into llama.cpp (AV:L/AC:L/PR:N/UI:R). Because the overflow overwrites adjacent heap memory with attacker-controlled data, successful exploitation can lead to arbitrary code execution or system compromise, which is reflected in the high confidentiality, integrity, and availability impacts (C:H/I:H/A:H).
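The 7.8 base score can be reproduced from these metrics with the CVSS v3.1 base-score equations. The sketch below handles only the Scope: Unchanged case. S:U is an assumption here, since the scope metric is not quoted above, but it is the value consistent with a 7.8 score for this vector. The function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (Scope: Unchanged values for PR).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal value >= x, float-safe."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score_scope_unchanged(vector: str) -> float:
    m = dict(part.split(":") for part in vector.split("/"))
    if m.get("S") != "U":
        raise ValueError("this sketch only covers Scope: Unchanged")
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * WEIGHTS["PR"][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# S:U is assumed; the other metrics are the ones quoted in the text.
score = base_score_scope_unchanged("AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H")  # 7.8
```

The impact sub-score comes to about 5.87 and the exploitability sub-score to about 1.83. Their sum, roughly 7.71, rounds up to 7.8.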
The GitHub security advisory at https://github.com/ggml-org/llama.cpp/security/advisories/GHSA-3p4r-fq3f-q74v confirms that the issue bypasses the fix for an earlier, similar vulnerability in the same file (CVE-2025-53630), because that fix overlooked certain code paths. Mitigation is to update to b8146 or later.
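For inventory or CI checks, the "b8146 or later" condition can be expressed as a small comparison. This is a minimal sketch that assumes build identifiers have the form b followed by a number and increase over time; only b8146 itself comes from the advisory text. The helper name is_patched is hypothetical, and obtaining the identifier from an installed binary or package is left to the reader.

```python
import re

FIXED_BUILD = 8146  # first fixed build according to the text above


def is_patched(tag: str) -> bool:
    """Return True if a build tag like 'b8146' is at or past the fix."""
    m = re.fullmatch(r"b(\d+)", tag.strip())
    if m is None:
        raise ValueError(f"unrecognised build tag: {tag!r}")
    return int(m.group(1)) >= FIXED_BUILD


# Example tags (illustrative values around the fixed build).
results = {tag: is_patched(tag) for tag in ("b8145", "b8146", "b9000")}
```

Raising an error on unrecognised tags, rather than returning False, keeps an unparseable version string from being silently reported as vulnerable or as safe.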
This flaw is particularly relevant in AI/ML contexts, as llama.cpp is widely used for efficient LLM inference on local systems, potentially exposing practitioners running unpatched inference workloads to local file-based attacks. No public evidence of real-world exploitation has been reported.
Details
- CWE(s)
- CWE-122 (Heap-based Buffer Overflow), CWE-190 (Integer Overflow or Wraparound)
AI Security Analysis
- AI Category
- NLP and Transformers
- Risk Domain
- N/A
- OWASP Top 10 for LLMs 2025
- None mapped
- Classification Reason
- Matched keywords: llama.cpp, llm