Cyber Posture

CVE-2026-33298

High · Public PoC

Published: 24 March 2026

Modified: 30 April 2026
KEV Added: not listed
Patch: b7824
CVSS Score: 7.8 (CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H)
EPSS Score: 0.0002 (4.4th percentile)
Risk Priority: 16 (weights: 60% EPSS · 20% KEV · 20% CVSS)
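
The page does not state how the 60/20/20 weights are combined. One reading that reproduces the displayed value of 16 is a weighted sum on a 0-100 scale, assuming the EPSS probability (not the percentile) is used as-is, KEV membership counts as 0 or 1, and the CVSS base score is divided by 10. The function name and the normalisation are assumptions, not the site's documented formula.

```python
def risk_priority(epss: float, in_kev: bool, cvss: float) -> int:
    """Hypothetical weighted score on a 0-100 scale (60% EPSS, 20% KEV, 20% CVSS)."""
    epss_part = 0.60 * epss                      # EPSS probability is already in [0, 1]
    kev_part = 0.20 * (1.0 if in_kev else 0.0)   # KEV listing treated as a 0/1 flag
    cvss_part = 0.20 * (cvss / 10.0)             # normalise CVSS base score to [0, 1]
    return round(100 * (epss_part + kev_part + cvss_part))


# Values taken from the header of this page.
score = risk_priority(epss=0.0002, in_kev=False, cvss=7.8)
print(score)
```

Under these assumptions almost all of the score comes from the CVSS term (15.6 of 15.612 before rounding), which is consistent with a low-EPSS, non-KEV entry.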

Summary

CVE-2026-33298 is a high-severity heap-based buffer overflow (CWE-122) in ggml llama.cpp. Its CVSS base score is 7.8 (High).

Operationally, exploitation aligns with the MITRE ATT&CK technique Exploitation for Client Execution (T1203). The EPSS score places it at the 4.4th percentile for exploit likelihood, well below the median. It is not currently listed in the CISA KEV catalog, but a public proof-of-concept is referenced.

This vulnerability is AI-related and is categorised as NLP and Transformers.

The strongest mitigations our analysis identified are NIST 800-53 SI-10 (Information Input Validation) and SI-2 (Flaw Remediation).

Threat & Defense at a Glance

What attackers do: exploitation maps to Exploitation for Client Execution (T1203) and 1 other technique.
What defenders deploy: see the NIST 800-53 controls recommended below.

Threat & Defense Details

Mitigating Controls (NIST 800-53 r5) · AI

Prevent: Updating to llama.cpp b7824 or later directly remediates the integer overflow in ggml_nbytes, preventing heap buffer overflows from crafted GGUF files.

Prevent: Validating tensor dimensions and sizes in GGUF files before processing stops integer overflows that would otherwise bypass memory validation.

Prevent: Memory protection mechanisms such as ASLR and DEP make it harder to turn the resulting heap buffer overflow into code execution.
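
The input-validation control above can be sketched as a size computation that refuses to proceed when the exact result would not fit in the platform's size type. This is a minimal illustration and not the upstream fix: the function name, its parameters, the 64-bit size_t assumption, and the optional file-size check are all hypothetical.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t on the target platform


def checked_tensor_nbytes(dims, type_size, block_size=1, file_size=None):
    """Return the exact byte size of a tensor, or raise ValueError.

    dims, type_size and block_size are hypothetical stand-ins for values
    read from a GGUF tensor-info record; this is not the upstream fix.
    """
    if not dims or any(d <= 0 for d in dims):
        raise ValueError("tensor dimensions must be positive")
    if dims[0] % block_size != 0:
        raise ValueError("first dimension is not a multiple of the block size")

    n_elements = 1
    for d in dims:
        n_elements *= d  # Python ints do not wrap, so this product is exact

    nbytes = n_elements // block_size * type_size
    if nbytes > SIZE_MAX:
        raise ValueError("tensor size does not fit in size_t")
    if file_size is not None and nbytes > file_size:
        raise ValueError("tensor is larger than the file that contains it")
    return nbytes
```

For example, `checked_tensor_nbytes([4096, 4096], 4)` returns 67108864, while dimensions whose product exceeds 64 bits raise `ValueError` instead of yielding a small wrapped size. In C or C++ the same idea is usually expressed with overflow-checked multiplication rather than arbitrary-precision integers.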

MITRE ATT&CK Enterprise Techniques · AI

T1203 Exploitation for Client Execution (Tactic: Execution)
Adversaries may exploit software vulnerabilities in client applications to execute code.
T1204.002 Malicious File (Tactic: Execution)
An adversary may rely upon a user opening a malicious file in order to gain execution.

Why these techniques?

The integer/heap overflow in GGUF tensor parsing is directly triggered by a crafted malicious file supplied to a local client application (llama.cpp), enabling client-side code execution via exploitation (T1203) after user interaction with the file (T1204.002).

Confidence: HIGH · MITRE ATT&CK Enterprise v18.1

NVD Description

llama.cpp is an inference of several LLM models in C/C++. Prior to b7824, an integer overflow vulnerability in the `ggml_nbytes` function allows an attacker to bypass memory validation by crafting a GGUF file with specific tensor dimensions. This causes `ggml_nbytes` to return a significantly smaller size than required (e.g., 4MB instead of Exabytes), leading to a heap-based buffer overflow when the application subsequently processes the tensor. This vulnerability allows potential Remote Code Execution (RCE) via memory corruption. b7824 contains a fix.
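
The wraparound described here can be reproduced with plain arithmetic. The sketch below emulates unsigned 64-bit multiplication in Python and is a simplified model, not the real `ggml_nbytes` implementation. The dimensions and the 4-byte element size are illustrative values chosen so that the wrapped result is 4 MiB; they are not taken from the advisory or the public PoC.

```python
MASK64 = 2**64 - 1  # emulate unsigned 64-bit wraparound


def wrapped_nbytes(dims, type_size):
    """Simplified size computation with 64-bit wraparound (not the real ggml_nbytes)."""
    total = type_size
    for d in dims:
        total = (total * d) & MASK64  # high bits are silently discarded
    return total


def true_nbytes(dims, type_size):
    """Exact size using Python's arbitrary-precision integers."""
    total = type_size
    for d in dims:
        total *= d
    return total


dims = [2**20, 2**42 + 1]        # illustrative values, not from the advisory or PoC
print(wrapped_nbytes(dims, 4))   # 4194304 bytes (4 MiB)
print(true_nbytes(dims, 4))      # 2**64 + 2**22 bytes (about 16 EiB)
```

A size check that only sees the wrapped 4 MiB value would accept the tensor, while code that later walks the tensor using its real dimensions would run far past the allocation.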

Deeper analysis · AI

CVE-2026-33298, published on 2026-03-24, is an integer overflow vulnerability (CWE-190) combined with a heap-based buffer overflow (CWE-122) in the `ggml_nbytes` function of llama.cpp, a C/C++ inference engine for large language models (LLMs). Versions prior to b7824 are affected. The flaw allows an attacker to bypass memory validation by crafting a GGUF file with specific tensor dimensions, causing `ggml_nbytes` to return a significantly smaller size than required (such as 4 MB instead of exabytes), which leads to memory corruption when the tensor is processed. It carries a CVSS v3.1 base score of 7.8 (AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H).
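
The 7.8 base score can be re-derived from the vector using the CVSS v3.1 base-score equations. The sketch below is a minimal calculator that only handles Scope: Unchanged, which is all this vector needs; the metric weights and the round-up rule follow the CVSS v3.1 specification, while the function and variable names are chosen for this example.

```python
import math

# CVSS v3.1 metric weights (PR values are those for Scope: Unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score for a CVSS:3.1 vector with Scope: Unchanged."""
    parts = dict(p.split(":") for p in vector.split("/")[1:])  # skip "CVSS:3.1"
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # 7.8
```

For this vector the impact sub-score is about 5.87 and the exploitability sub-score about 1.83; the local attack vector and required user interaction are what keep the total below the 9.x range typical of network-reachable memory corruption bugs.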

The attack vector is local (AV:L) and requires user interaction (UI:R), but no privileges (PR:N). An attacker supplies a malicious GGUF file to a user running a vulnerable llama.cpp application and tricks them into loading it for LLM inference. Loading the file triggers the integer overflow during size calculation, which results in a heap buffer overflow and potential code execution through memory corruption. The NVD description labels this outcome remote code execution (RCE), even though the scored vector is local.

Mitigation is addressed in the official GitHub security advisory (GHSA-96jg-mvhq-q7q7) and release tag b7824, which contains a fix for the `ggml_nbytes` function. Security practitioners should update to llama.cpp b7824 or later to prevent exploitation.
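
A quick way to triage deployments is to compare the build tag of each installed llama.cpp against the fixed release. The helper below is a sketch that assumes release tags have the form `b` followed by a build number, as in `b7824`; the function name is hypothetical, and how the tag is obtained from a given installation is left out.

```python
import re

FIXED_BUILD = 7824  # from the advisory text: b7824 contains the fix


def is_patched(tag: str) -> bool:
    """Return True if a llama.cpp release tag is at or above b7824.

    Assumes tags look like 'b' followed by a build number.
    """
    match = re.fullmatch(r"b(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized release tag: {tag!r}")
    return int(match.group(1)) >= FIXED_BUILD


for tag in ("b7823", "b7824", "b8000"):
    print(tag, "patched" if is_patched(tag) else "vulnerable")
```

Builds made from a fork or an arbitrary commit may not carry such a tag, in which case the presence of the fix has to be confirmed against the source itself.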

This vulnerability is relevant to AI/ML deployments that rely on llama.cpp for lightweight, local LLM inference, and it highlights the risk carried by the model-file parsing components of such frameworks. No public evidence of real-world exploitation is available.

Details

CWE(s)

CWE-122 (Heap-based Buffer Overflow)
CWE-190 (Integer Overflow or Wraparound)

Affected Products

ggml llama.cpp: versions before b7824

AI Security Analysis · AI

AI Category: NLP and Transformers
Risk Domain: N/A
OWASP Top 10 for LLMs 2025: None mapped
Classification Reason: Matched keywords: llama.cpp, llm

CVEs Like This One

CVE-2026-34159 · Same product: ggml llama.cpp
CVE-2026-21869 · Same product: ggml llama.cpp
CVE-2026-41445 · Shared CWE-122, CWE-190
CVE-2025-21172 · Shared CWE-122, CWE-190
CVE-2026-34545 · Shared CWE-122, CWE-190
CVE-2026-27940 · Shared CWE-122, CWE-190
CVE-2026-23719 · Shared CWE-122
CVE-2025-35984 · Shared CWE-122
CVE-2025-27173 · Shared CWE-122
CVE-2025-50129 · Shared CWE-122

References