Cyber Resilience

AI-Related Software Vulnerabilities: 2025–2026

Last updated: 2026-07-03

AI and machine-learning software has generated its own steady stream of CVEs across deep-learning frameworks, model-serving stacks, LLM application platforms, agent frameworks and enterprise AI assistants. This article examines that body of AI-related software as a whole and compares it with the rest of the vulnerability corpus across 2025 and the first half of 2026: how severe its vulnerabilities are, how likely they are to be exploited, which weaknesses dominate, and who builds the affected software.

Scope and data

We draw on the National Vulnerability Database (NVD) and the Exploit Prediction Scoring System (EPSS). A CVE is counted as AI-related when our annotation pipeline classifies the affected software as AI/ML technology. The categories we track, with a few well-known examples of each, are:

Type of AI-related software | Well-known examples
Hosted model APIs & models | OpenAI / GPT, Anthropic / Claude, Google Gemini
Enterprise AI assistants | Microsoft 365 Copilot, GitHub Copilot, Security Copilot
AI agent protocols & integrations | Model Context Protocol (MCP), Agent2Agent (A2A), LangChain agents
Deep-learning frameworks | TensorFlow, PyTorch, Keras
Machine-learning libraries | scikit-learn, XGBoost, MLflow
Data-processing libraries | NumPy, Pandas
NLP & transformer libraries | Hugging Face Transformers, LangChain, llama.cpp
Classical NLP libraries | spaCy, NLTK, Gensim
Computer vision | OpenCV, Pillow, torchvision
Vector / similarity search | FAISS, Pinecone, Weaviate
LLM application platforms & other AI software | Open WebUI, Dify, LibreChat

We single out software implementing the Model Context Protocol (MCP) as a highlighted subset in several charts, because it is the subcategory that diverges most from the rest of the AI cohort. Unless noted, charts cover CVEs published in 2025; the growth and year-over-year sections extend into 2026.

Summary findings

The AI attack surface is growing fast

AI-related CVE publication climbed through 2025 and accelerated sharply into 2026: the cohort produced about 780 CVEs in all of 2025 and had already added more than 1,150 in 2026 by early July. Monthly counts that hovered between roughly 35 and 105 through 2025 reached 347 in May 2026. The non-AI corpus grew too, but nowhere near as steeply, so the AI share of new CVEs is rising rather than holding constant. Counts here are as of 2026-07-03 and were re-based after a keyword false-positive correction removed 1,252 misclassified CVEs from the AI cohort, the same grading discipline we apply in our mapping-accuracy audit. Earlier drafts cited about 870 CVEs for 2025.
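As a rough check on the pace of growth, the year-to-date count can be annualised with a naive linear extrapolation. This is only a sketch: the helper name `annualised_pace` is ours, the inputs are the 2025 and 2026 totals quoted in this article, and a straight-line projection ignores both the acceleration in monthly counts and any reporting lag in the most recent weeks.

```python
from datetime import date

def annualised_pace(count: int, as_of: date) -> float:
    """Scale a year-to-date count to a full year, assuming a constant daily rate."""
    start = date(as_of.year, 1, 1)
    elapsed_days = (as_of - start).days + 1              # include the as-of day
    days_in_year = (date(as_of.year + 1, 1, 1) - start).days
    return count * days_in_year / elapsed_days

# Totals taken from the article: 780 for all of 2025, 1,152 for 2026 as of 2026-07-03.
projected_2026 = annualised_pace(1152, date(2026, 7, 3))
ratio_to_2025 = projected_2026 / 780

print(f"Projected 2026 total: {projected_2026:.0f}")
print(f"Multiple of 2025:     {ratio_to_2025:.2f}x")
```

Because the monthly counts are still rising, a constant-rate projection like this is more likely to understate than overstate the full-year figure.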

Figure 1. Monthly CVE volume and year-over-year totals, AI-related vs non-AI. The 2026 series runs through the last complete month; 2026 is already on pace to more than double 2025.

Severity

Vulnerabilities in AI-related software are more severe than vulnerabilities in other software. The average CVSS base score is 7.06, against 6.61 for the non-AI baseline; the difference is statistically significant (Welch's t-test, p < 0.001). Roughly 18% of AI-related CVEs are rated Critical, about double the 8.7% Critical rate in the rest of the corpus, and roughly 54% are High or Critical. The MCP subset is higher still, at an average of 7.62.
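For readers who want to reproduce the comparison on their own data, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The two score lists are made-up values for illustration, not the article's data, and the function name `welch_t` is ours. With SciPy available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns the p-value.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Hypothetical CVSS base scores, for illustration only.
ai_scores = [9.8, 8.8, 7.5, 7.2, 6.5, 9.1, 5.3, 8.1]
non_ai_scores = [6.1, 5.4, 7.5, 4.3, 6.5, 8.8, 5.5, 6.1, 7.2, 5.9]

t_stat, dof = welch_t(ai_scores, non_ai_scores)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Welch's variant is the natural choice here because the two cohorts differ greatly in size and need not share a variance.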

Figure 2. Severity distribution and CVSS base-score spread for 2025 CVEs, with the MCP subset shown for reference.

The CVSS fingerprint

Rather than chart each CVSS vector component separately, Figure 3 shows the share of each cohort sitting at the high-risk value of every component at once. The shape is the story. AI-related vulnerabilities are markedly more network-reachable (89% have a Network attack vector versus 73% for non-AI software) and require fewer privileges to exploit (59% need none, versus 48%). They also land harder when exploited, with elevated High-impact rates for confidentiality (48% vs 38%) and integrity (39% vs 31%).
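A fingerprint of this kind can be sketched in a few lines: parse each CVSS v3.x vector string and, for every component, count the share of CVEs sitting at its high-risk value. The choice of high-risk values below (Network, Low complexity, no privileges, no user interaction, High impact) and the sample vectors are our assumptions for illustration; the article's pipeline may define or handle them differently.

```python
# Assumed "high-risk" value for each CVSS v3.x base metric.
HIGH_RISK = {"AV": "N", "AC": "L", "PR": "N", "UI": "N", "C": "H", "I": "H", "A": "H"}

def parse_vector(vector: str) -> dict:
    """Turn 'CVSS:3.1/AV:N/AC:L/...' into {'AV': 'N', 'AC': 'L', ...}."""
    metrics = vector.split("/")[1:]          # drop the 'CVSS:3.1' prefix
    return dict(m.split(":", 1) for m in metrics)

def fingerprint(vectors: list) -> dict:
    """Share of vectors at the high-risk value of each component."""
    parsed = [parse_vector(v) for v in vectors]
    return {
        metric: sum(p.get(metric) == value for p in parsed) / len(parsed)
        for metric, value in HIGH_RISK.items()
    }

# Hypothetical vectors, for illustration only.
sample = [
    "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
    "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N",
    "CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:H/A:N",
    "CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:L/I:L/A:N",
]

for metric, share in fingerprint(sample).items():
    print(f"{metric}: {share:.0%}")
```

Running the same function over the AI and non-AI cohorts separately gives two profiles that can be compared component by component.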

Two components notably do not separate the cohorts. Attack complexity is low for about 91% of CVEs in both groups, and required user interaction is, if anything, marginally lower for AI software. This is worth stating plainly: the "agentic, probabilistic, high-complexity" exploitation pattern that characterised the MCP subset is not a property of AI-related software in general. The general cohort looks like ordinary network-exposed application software, only more so on reachability and impact.

Why confidentiality runs high. AI and especially LLM-application software routinely handles high-value secrets: API keys and model credentials, user prompts containing PII or business data, internal knowledge bases, and live context shuttled between tools. A successful exploit frequently means direct exposure of that data, which pushes confidentiality ratings up.

Figure 3. Share of 2025 CVEs sitting at the high-risk value of each CVSS vector component. AI-related software skews network-reachable, low-privilege and high-impact; attack complexity and user-interaction are not differentiators.

Which AI software is riskiest?

The AI cohort is not monolithic. Broken into subcategories, two patterns stand out. First, the self-hosted LLM application layer dominates by volume: LLM Application Platforms (Open WebUI, Dify, LibreChat, Lunary and similar) is now the single largest subcategory at 291 CVEs in 2025, with high severity (average 7.2) and the second-highest near-term exploitation risk by EPSS. Second, the infrastructure of agency (the agent protocols and integrations that wire models to tools) tops both measures, with the highest average severity (7.6) and the highest EPSS-based exploitation risk. The transformer / model-serving and ML-library layers (vLLM, Ollama, LlamaIndex, MLflow) sit just behind on severity at 7.4–7.5. Enterprise AI assistants are more moderate (7.0), and the classic deep-learning frameworks rank lowest (6.5), though far from harmless.

The takeaway for defenders: the closer a piece of AI software sits to taking actions on a user's behalf (brokering model calls, wiring agents to tools, executing generated code), the worse its vulnerabilities tend to be.

Figure 4. Average severity (left) and near-term exploitation risk (right) for the eight largest AI subcategories, 2025.

Common weaknesses

This is where the general AI cohort diverges most sharply from MCP. The MCP subset was dominated by command injection: CWE-78 (30%) and CWE-77 (23%) accounted for more than half of its CVEs. The wider AI cohort has no such dominant weakness. Its top weaknesses are cross-site scripting (CWE-79, 9.6%), code injection (CWE-94, 7.2%), missing authorization (CWE-862, 6.8%), unsafe deserialization (CWE-502, 6.6%) and server-side request forgery (CWE-918, 5.7%), with command injection present but no longer dominant.

The profile reads like that of a fast-moving web-application ecosystem: model-serving and LLM-app projects expose HTTP endpoints, render model output into web UIs (XSS), fetch remote resources on the model's behalf (SSRF), load serialized model artifacts (deserialization) and bolt authentication on late (missing authorization). Notably, unsafe deserialization and SSRF are far more prevalent here than in the non-AI baseline, a direct consequence of how AI software loads model files and reaches out to external services.

Figure 5. Top 12 weaknesses in the AI-related cohort, with non-AI and the MCP subset for contrast.

Who builds it, and the open-source question

The vendors most represented in AI-related CVEs are, with few exceptions, open-source LLM-application and agent projects: the Linux Foundation, Dify, Lunary, vLLM, LibreChat, Hugging Face, Cursor and Open WebUI among them, alongside a small number of proprietary vendors such as Microsoft. Overall, 59% of 2025 AI-related CVEs affect open-source software and 41% proprietary.

This open-source skew cuts two ways. It partly reflects transparency (anyone can file a CVE against an open project, and these communities do) rather than open source being inherently less safe. But it also reflects a genuine reality: much of the AI application layer is being built in fast-moving open-source projects that have not yet accumulated the security hardening of mature web frameworks. For consumers, the practical implication is that an AI deployment's risk surface is largely a supply-chain surface.

Figure 6. Most-affected vendors and the open-source / proprietary split of AI-related CVEs, 2025.

Exploitability

CISA's Known Exploited Vulnerabilities (KEV) catalog lists only about one in 300 CVEs and is too sparse to characterise a cohort this size, so we use EPSS, which estimates the probability of exploitation in the wild. By EPSS, AI-related vulnerabilities carry materially higher near-term exploitation risk: a mean of 0.017 versus 0.010 for non-AI software, with 1.3% scoring above 0.5 (versus 0.7%). The difference between the distributions is large and highly significant (Mann–Whitney, p < 0.001). The MCP subset is higher again, at 3.0% above 0.5. In short, the emerging AI ecosystem does not just produce more severe vulnerabilities; it produces vulnerabilities that are more likely to be exploited soon.
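The summary statistics quoted above are simple to compute once EPSS scores are in hand. The sketch below, using only the standard library, shows the mean, the share of scores above 0.5, and a direct pair-counting form of the Mann-Whitney U statistic. The score lists are invented for illustration and the helper names are ours; for real cohorts, `scipy.stats.mannwhitneyu` computes the statistic far more efficiently and also returns a p-value.

```python
from statistics import mean

def tail_share(scores, threshold=0.5):
    """Fraction of scores strictly above the threshold."""
    return sum(s > threshold for s in scores) / len(scores)

def mann_whitney_u(a, b):
    """U statistic for sample a: pairs (x, y) with x > y, ties counted as half.

    Quadratic in the sample sizes, so suitable only for small illustrations.
    """
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

# Hypothetical EPSS scores, for illustration only.
ai_epss = [0.002, 0.010, 0.040, 0.650, 0.008, 0.120]
non_ai_epss = [0.001, 0.003, 0.005, 0.020, 0.004, 0.009, 0.002]

u = mann_whitney_u(ai_epss, non_ai_epss)
superiority = u / (len(ai_epss) * len(non_ai_epss))

print(f"mean AI = {mean(ai_epss):.3f}, mean non-AI = {mean(non_ai_epss):.3f}")
print(f"share > 0.5: AI = {tail_share(ai_epss):.1%}, non-AI = {tail_share(non_ai_epss):.1%}")
print(f"P(AI score > non-AI score) ~ {superiority:.2f}")
```

Dividing U by the number of pairs gives the probability that a randomly chosen AI-related CVE outscores a randomly chosen non-AI one, which is often easier to interpret than the p-value alone. A rank-based test suits EPSS because the scores are heavily skewed toward zero.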

Figure 7. Cumulative distribution of EPSS scores (log-log). A curve lower and to the right carries more high-probability vulnerabilities.

Are threat actors targeting AI software yet?

Here the answer is: not visibly, not yet. Across the roughly 1,900 AI-related CVEs published in 2025 and 2026, our CVE-to-actor attribution data ties exactly one to named threat actors as of 2026-07-03: CVE-2026-22813, an HTML-injection flaw in the open-source AI coding agent OpenCode. And that attribution is a cautionary tale: it links a single open-source XSS bug to five unrelated state-aligned groups at once (Gamaredon, MuddyWater, Kimsuky, Mustang Panda and Volt Typhoon), a pattern that says far more about the noise in automated attribution than about any real campaign.

The gap between attack surface and attribution is the point. The vulnerabilities are here, they are severe, and they are increasingly exploitable, but public attribution of who is exploiting AI software lags far behind. Defenders who wait for a named actor before treating this surface as a target will be waiting a long time. The leading indicators are not threat-intel labels but the documented, unattributed exploitation cases in the MCP world: the unauthenticated RCE in MCP Inspector (CVE-2025-49596) and the command injection in mcp-remote (CVE-2025-6514).

Year over year: did 2026 get worse?

So far, 2026 looks like more of 2025 rather than something qualitatively worse. The volume has surged, but the severity profile is essentially unchanged.

AI-related cohort | 2025 (full year) | 2026 (to date)
CVEs published | 780 | 1,152
Average CVSS base score | 7.06 | 7.00
High or Critical | 53.6% | 54.8%

EPSS scores appear lower for the 2026 cohort, but that is an artefact rather than good news: EPSS for very recently published CVEs has not yet matured, so a current-year cohort will always look quieter than it eventually proves to be. The volume trend is the real story.

Recommendations

For software producers

The weakness profile points to well-understood defenses applied to a new context. Treat AI-application code as the network-exposed web software it is: sanitize model output before rendering it (XSS), constrain and validate outbound requests (SSRF), avoid unsafe deserialization of model artifacts and untrusted payloads, and enforce authentication and authorization on every endpoint by default rather than as an afterthought. For agent and tool-calling components, add AI-specific controls on top: sandboxing, least-privilege scoping of the host context, tool signing/verification, and human-in-the-loop confirmation for sensitive actions.
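Two of these defenses, escaping model output and constraining outbound requests, can be illustrated with the Python standard library. This is a minimal sketch rather than a complete control: the allow-list contents and all function names are hypothetical, and a production SSRF defense would also need to check the addresses a host actually resolves to at connection time and to handle redirects.

```python
import html
import ipaddress
from urllib.parse import urlsplit

# Hypothetical allow-list; a real deployment would load this from configuration.
ALLOWED_HOSTS = {"api.example.com"}

def render_model_output(text: str) -> str:
    """Escape model output so markup in it is shown as text, not interpreted."""
    return html.escape(text, quote=True)

def is_allowed_url(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Accept only https URLs, without embedded credentials, to allow-listed hosts."""
    try:
        parts = urlsplit(url)
    except ValueError:          # malformed URL, e.g. a broken IPv6 literal
        return False
    return (
        parts.scheme == "https"
        and parts.username is None
        and parts.hostname in allowed_hosts
    )

def is_public_address(addr: str) -> bool:
    """Reject loopback, private, link-local and other non-global IP addresses."""
    return ipaddress.ip_address(addr).is_global

print(render_model_output('<img src=x onerror="alert(1)">'))
print(is_allowed_url("https://api.example.com/v1/fetch"))
print(is_allowed_url("http://169.254.169.254/latest/meta-data/"))
```

An allow-list is used here rather than a block-list because the set of destinations an AI application legitimately needs is usually small and known, while the set of dangerous internal targets is not.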

For enterprises (software consumers)

Because the AI risk surface is overwhelmingly a supply-chain surface, inventory the open-source AI components in your stack the way you would any dependency, and track their advisories. Ask vendors and projects for penetration-test results focused on the weakness classes above, and prioritize patching for the agentic and API-brokering components that this data shows are both the most severe and the most likely to be exploited. Don't wait for threat-actor attribution to treat this surface as a target.
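As a starting point for such an inventory, the sketch below lists which packages from a watch-list are installed in the current Python environment, with their versions. The watch-list is illustrative, built from PyPI names of projects mentioned in this article, and the function names are ours. It covers only one environment; containers, other language ecosystems and vendored code would need their own tooling.

```python
from importlib.metadata import distributions

# Illustrative watch-list of PyPI package names; extend it to match your stack.
WATCHLIST = {
    "langchain", "transformers", "mlflow", "torch",
    "tensorflow", "vllm", "llama-index", "open-webui",
}

def normalize(name: str) -> str:
    """Normalise a package name for comparison (case, '_' and '.' variants)."""
    return name.lower().replace("_", "-").replace(".", "-")

def installed_ai_components(watchlist=WATCHLIST) -> dict:
    """Map each watch-listed package found in this environment to its version."""
    found = {}
    for dist in distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta else None
        if name and normalize(name) in watchlist:
            found[normalize(name)] = dist.version
    return dict(sorted(found.items()))

for package, version in installed_ai_components().items():
    print(f"{package}=={version}")
```

The resulting name and version pairs can then be matched against the advisories published for each project.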


Author: Arve Kjoelen. Data: NVD and EPSS. Charts from the 10 Jun 2026 20:47 UTC snapshot; cohort counts, severity shares and attribution figures refreshed 2026-07-03 after the keyword false-positive correction. Cohorts defined by the project's AI-classification pipeline; the non-AI baseline is all other CVEs published in the same window. Counts reflect CVEs carrying the relevant data (e.g. a CVSS vector or an EPSS score) and exclude those that do not. Feel free to share, adapt, or build on this work; attribution appreciated but not required.