CVE-2024-34359
Published: 14 May 2024
Summary
CVE-2024-34359 is a critical-severity Improper Neutralization of Equivalent Special Elements (CWE-76) vulnerability. Its CVSS base score is 9.6 (Critical).
Operationally, exploitation aligns with the MITRE ATT&CK technique Exploitation for Client Execution (T1203); ranked in the top 1.6% of CVEs by exploit likelihood; it is not currently listed in the CISA KEV catalog.
This vulnerability is AI-related — categorised as NLP and Transformers; in the LLM/Generative AI Risks risk domain; MITRE ATLAS techniques in scope: AI Supply Chain Compromise (AML.T0010).
Deeper analysis
llama-cpp-python, the Python bindings for llama.cpp, contains a server-side template injection vulnerability in the Llama class within llama.py. When the constructor loads a .gguf model file it extracts the chat template from the file metadata and passes it to Jinja2ChatFormatter, which instantiates an unsandboxed jinja2.Environment and later renders the template in its __call__ method. This flaw, tracked as CWE-76, permits arbitrary code execution and carries a CVSS 3.1 score of 9.6.
An attacker who can supply or substitute a malicious .gguf file can embed a crafted chat template that executes code when the model is initialized or used for chat interactions. Because the vulnerable code path runs with the privileges of the Python process loading the model, successful exploitation yields remote code execution on the host.
Public advisories and patches are available in the GitHub Security Advisory GHSA-56xg-wfcc-g829 and the corresponding commit b454f40a9a1787b2b5659cd2cb00819d983185df, which address the unsafe Jinja2 usage.
The component is widely used to run large-language models, placing the issue in an AI/ML context. The associated EPSS score has remained near 0.62 with a recorded peak of 0.6264.
EU & UK References
- 🇪🇺 ENISA EUVD: EUVD-2024-1433
Vulnerability details
llama-cpp-python is the Python bindings for llama.cpp. `llama-cpp-python` depends on class `Llama` in `llama.py` to load `.gguf` llama.cpp or Latency Machine Learning Models. The `__init__` constructor built in the `Llama` takes several parameters to configure the loading and running of…
more
the model. Other than `NUMA, LoRa settings`, `loading tokenizers,` and `hardware settings`, `__init__` also loads the `chat template` from targeted `.gguf` 's Metadata and furtherly parses it to `llama_chat_format.Jinja2ChatFormatter.to_chat_handler()` to construct the `self.chat_handler` for this model. Nevertheless, `Jinja2ChatFormatter` parse the `chat template` within the Metadate with sandbox-less `jinja2.Environment`, which is furthermore rendered in `__call__` to construct the `prompt` of interaction. This allows `jinja2` Server Side Template Injection which leads to remote code execution by a carefully constructed payload.
- CWE(s)
AI Security AnalysisAI
- AI Category
- NLP and Transformers
- Risk Domain
- LLM/Generative AI Risks
- OWASP Top 10 for LLMs 2025
- None mapped
- Classification Reason
- llama-cpp-python provides Python bindings for llama.cpp, used for loading and running Llama GGUF models, which are transformer-based large language models for NLP tasks. The vulnerability occurs during model loading and chat template parsing specific to LLM inference.
Related Threats
MITRE ATT&CK Enterprise TechniquesAI
Why these techniques?
CVE-2024-34359 enables Server-Side Template Injection (SSTI) via crafted Jinja2 chat templates in .gguf model metadata, leading to remote code execution when loaded by llama-cpp-python. This facilitates Exploitation for Client Execution (T1203) and Template Injection (T1221).
MITRE ATLAS TechniquesAI
MITRE ATLAS techniques
Affected Assets
Mitigating Controls
No mitigating controls mapped yet. The per-CVE control annotator has not reached this CVE.