Reading cross-walks: how the framework chips work

Vulnerability frameworks talk past each other. OWASP describes risk categories; NIST 800-53 describes controls; CSF describes outcomes; CWE describes weaknesses; ATT&CK describes adversary techniques; DISA STIGs describe per-OS configuration rules. They’re useful for different audiences — but the moment you need to answer “does this NIST control address OWASP A01?” or “which weaknesses does T1190 exploit?” you need a cross-walk. This page explains where the chips you see on framework pages come from, how to read them, and what changed in the most recent overhaul of the verb dictionary behind them.

What a cross-walk is

A cross-walk row says “entry A in framework X relates to entry B in framework Y”, with a verb that names the kind of relationship and an extent qualifier: partial, mostly, or full. We don’t write rows for absence: for a framework pair we have already authored, a missing chip means the LLM and the reviewer both found no meaningful connection. A single direction of a row can still be rated none when only the other direction holds.

Some mappings are authoritative

A few framework pairs come with official cross-walks that we ingest verbatim:

NIST CSF 2.0 → 800-53 r5. NIST publishes the mapping in the Cybersecurity & Privacy Reference Tool. 738 directed edges across the 106 CSF subcategories.
DISA STIG → NIST 800-53. Each STIG rule references one or more CCIs (Common Control Identifiers). DISA publishes the CCI→800-53 mapping; we compose rule→CCI→800-53 deterministically. 4,652 edges, no LLM.
CWE ↔ CAPEC and CWE ↔ ATT&CK candidate sets. MITRE publishes the Related_Attack_Patterns bridge in the CWE catalog; we walk it to seed the candidate set our authoring pass rates.
OWASP Top 10 for Web 2025 → CWE. Each OWASP category is a CWE meta-category in MITRE’s catalog; membership is published by MITRE.

These show up as chips with no “reviewed by us” caveat — they’re as reliable as the publishing authority in the direction it publishes. And that direction is the catch: every one of them is one-way (note the arrows above). The reverse direction is a different matter — see the next section.
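The rule→CCI→800-53 composition described for the DISA STIG pair can be sketched as a plain join. This is a minimal illustration, not the production ingest: the rule, CCI, and control identifiers below are made-up sample data, and the function name compose is hypothetical.

```python
# Hypothetical sample data. Real inputs would come from the STIG benchmark
# (rule -> CCIs) and DISA's published CCI list (CCI -> 800-53 controls).
rule_to_ccis = {
    "RULE-A": ["CCI-1", "CCI-2"],
    "RULE-B": ["CCI-2"],
}
cci_to_controls = {
    "CCI-1": ["AC-2"],
    "CCI-2": ["AC-2", "IA-5"],
}


def compose(rule_to_ccis, cci_to_controls):
    """Join rule->CCI with CCI->control into sorted, de-duplicated edges."""
    edges = set()
    for rule, ccis in rule_to_ccis.items():
        for cci in ccis:
            # A CCI with no published control mapping contributes no edge.
            for control in cci_to_controls.get(cci, []):
                edges.add((rule, control))
    return sorted(edges)


edges = compose(rule_to_ccis, cci_to_controls)
for rule, control in edges:
    print(rule, "->", control)
```

Because the join uses only published lookups, the same inputs always give the same edges, which is what makes the pair deterministic and keeps the LLM out of it.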

One-way authority, two-way reality

The authoritative mappings only ever assert the forward edge. NIST says CSF → 800-53; MITRE says OWASP-Web category → member CWEs. Neither publishes the reverse, yet a control page needs to answer “which outcomes point at me?” and a weakness page needs “which categories cover me?” Inverting a one-way mapping to fake that reverse is misleading: the set of subcategories that happen to cite a control is a by-product of the forward mapping, not a curated statement that the control supports those outcomes.

So for these pairs we now author our own two-way mapping — an LLM rates both directions with an extent, and the rows go through the same QA queue as every other interpretive pair. On a control page (CSF ↔ 800-53) and a CWE page (OWASP-Web ↔ CWE) you get a small toggle:

Our mapping (default) — our two-way reading, including edges the authority never listed.
Authoritative only — just the official one-way edges, nothing we added.

Because we rate every authoritative edge as well as our own, we can show where the two disagree in the direction the authority actually publishes. A chip is badged ≠ when it is either a conflict (the authority maps it but our reviewer found no meaningful link) or an addition (we map it but the authority doesn’t). Pure differences of degree — we say mostly where the authority simply says “mapped” — are not flagged, because the authority carries no extent to disagree with. The point isn’t that the authority is wrong; it’s that “authoritative” and “bidirectional” are not the same thing, and the badge keeps the two honest.
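The badge rule above reduces to a small decision function. The sketch below assumes it is called only for edges in the direction the authority publishes; the function name badge and its parameters are illustrative, not the renderer's actual interface.

```python
def badge(authority_has_edge: bool, our_extent: str) -> str:
    """Return "≠" for a conflict or an addition, and "" otherwise.

    our_extent is one of "none", "partial", "mostly", "full".
    """
    we_map_it = our_extent != "none"
    if authority_has_edge and not we_map_it:
        return "≠"  # conflict: the authority maps it, our reviewer found no link
    if we_map_it and not authority_has_edge:
        return "≠"  # addition: we map it, the authority does not
    # Agreement, or a pure difference of degree, which is never flagged
    # because the authority publishes no extent to disagree with.
    return ""


print(repr(badge(True, "none")), repr(badge(False, "partial")), repr(badge(True, "mostly")))
```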

Each two-way mapping ships as a release with its own reliability and abstraction measures, a diff against the authoritative original (where one exists), and downloadable raw data (JSON / CSV / XLSX). See all cross-walk releases.

The rest needs interpretation

Most useful cross-walks don’t exist as official data: OWASP↔NIST, ASVS↔CWE, ASVS↔NIST, CSF↔CWE, CWE↔STIG, CWE↔ATT&CK extents. The relationships are real and load-bearing for practitioners, but no organisation publishes them. We fill the gap with a two-pass process:

LLM authoring. For each source entry, we narrow the target framework to the top 8–10 most-likely candidates by family seed + text overlap (and, for CWE↔ATT&CK, by the MITRE-published bridge), then ask a frontier model to evaluate each pair in both directions and return an extent rating + a one-sentence rationale.
Human QA. Every LLM-authored row sits in a review queue under authority: "llm_unverified" until a person promotes it to manual_QA. Confident rows (mostly/full in both directions) can be batch-promoted; the rest get per-pair review.
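The promotion rule in the QA pass can be sketched as follows. The Row class and its field names are assumptions made for illustration; only the authority values and the "mostly/full in both directions" test come from the description above. The sketch assumes a reviewer has already approved the batch.

```python
from dataclasses import dataclass

CONFIDENT = {"mostly", "full"}


@dataclass
class Row:
    # Hypothetical row shape; the real table may differ.
    source: str
    target: str
    forward_extent: str
    reverse_extent: str
    authority: str = "llm_unverified"


def can_batch_promote(row: Row) -> bool:
    """Confident rows are rated mostly or full in both directions."""
    return (
        row.authority == "llm_unverified"
        and row.forward_extent in CONFIDENT
        and row.reverse_extent in CONFIDENT
    )


def batch_promote(rows):
    """Promote confident rows in place; return the rows left for per-pair review."""
    remaining = []
    for row in rows:
        if can_batch_promote(row):
            row.authority = "manual_QA"
        else:
            remaining.append(row)
    return remaining


rows = [
    Row("SRC-1", "TGT-1", "full", "mostly"),
    Row("SRC-2", "TGT-2", "partial", "full"),
]
needs_review = batch_promote(rows)
print([r.authority for r in rows], len(needs_review))
```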

Why “covers” stopped being enough

For most of the project, the canonical verb was a single word — “covers”. The convention read: “A covers B by X” means “if you implement A, X of B is addressed.” That works cleanly when both endpoints are controls: a narrow ASVS check covers a broad NIST control partially; the broad NIST control covers the narrow ASVS check fully. Asymmetry comes from breadth-of-scope.

It breaks the moment the two endpoints aren’t the same kind of thing. Try saying out loud: “T1190 covers CVE-2026-1234.” It doesn’t parse. The right sentence is “T1190 exploits CVE-2026-1234,” or from the other direction “CVE-2026-1234 is susceptible to T1190.” The verb is direction-natural; one word doesn’t do both sides.

The author-prompt history showed the same friction. CWE↔ATT&CK rows were getting authored under a prompt that asked the LLM to rate “covers” in both directions, and the model had to twist the verb into “the technique exploits one slice of the weakness’s exploitation surface, so attack covers cwe = partial.” That worked, but it leaned on a particular reading of “covers” that drifted between author runs.

Two classes of mapping

We rewrote the convention into two classes (settled 2026-06-03; canonical doc at docs/xwalk-taxonomy.md in the repo):

Class 1 — same-category, shared verb. Both endpoints are the same kind. One verb works both ways; asymmetry comes from breadth-of-scope. Examples: Control↔Control uses covers; Threat↔Threat uses subsumes / subsumed_by; Vuln-class↔Vuln-class uses generalizes / specializes.
Class 2 — cross-category, two distinct verbs. The two endpoints are different kinds. Each direction has its own direction-natural verb and its own independently-rated extent. The two extents can disagree, because the questions are different. CWE→ATT&CK uses enables in the forward direction (“weakness enables technique”) and exploits in the reverse (“technique exploits weakness”). Control→Threat uses mitigates / is_mitigated_by. CVE→Threat uses is_susceptible_to / exploits. CWE→Asset uses manifests_in / is_prone_to.

The full verb dictionary has 14 entries. Every row in controls_xwalks now carries a relation field naming the verb the rating was authored under, plus an extent field with the none / partial / mostly / full rating.
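To make the two classes concrete, here is a rough sketch of a verb lookup and a single row. The kind labels ("control", "cwe", and so on), the VERBS table layout, and the helper names are assumptions for illustration; the table lists only the verbs named on this page, not the full dictionary.

```python
# (source_kind, target_kind) -> (forward_verb, reverse_verb).
# Kind labels are hypothetical; the verbs are the ones named in the text.
VERBS = {
    ("control", "control"): ("covers", "covers"),
    ("threat", "threat"): ("subsumes", "subsumed_by"),
    ("vuln_class", "vuln_class"): ("generalizes", "specializes"),
    ("cwe", "attack"): ("enables", "exploits"),
    ("cwe", "capec"): ("enables", "exploits"),
    ("control", "threat"): ("mitigates", "is_mitigated_by"),
    ("cve", "threat"): ("is_susceptible_to", "exploits"),
    ("cwe", "asset"): ("manifests_in", "is_prone_to"),
}


def mapping_class(source_kind, target_kind):
    """Class 1 when both endpoints are the same kind, Class 2 otherwise."""
    return 1 if source_kind == target_kind else 2


def verbs_for(source_kind, target_kind):
    """Return (forward_verb, reverse_verb) for a pair, in either order."""
    key = (source_kind, target_kind)
    if key in VERBS:
        return VERBS[key]
    reverse_key = (target_kind, source_kind)
    if reverse_key in VERBS:
        forward, reverse = VERBS[reverse_key]
        return reverse, forward
    raise KeyError(f"no verbs registered for {source_kind} -> {target_kind}")


# One directed row, using the relation and extent fields described above.
row = {"source": "CWE-79", "target": "CAPEC-209", "relation": "enables", "extent": "full"}
print(mapping_class("cwe", "attack"), verbs_for("attack", "cwe"), row["relation"])
```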

A chip on a category page renders like this: AC-3 ←F →P. The arrow marks read: ← the other side does the verb to this entry (full); → this entry does the verb to the other side (partial). The verbs themselves appear in the tooltip on hover, e.g. “AC-3 mitigates T1078 (full); T1078 is mitigated by AC-3 (partial)” when the chip sits on the T1078 page.
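The chip text can be reproduced with a few lines of string handling. This is a sketch only: the function name render_chip is hypothetical, the letter M for mostly is a guess (the page only shows F and P), and omitting the arrow for a none direction is likewise an assumption.

```python
# F and P appear in the example chip; M for "mostly" is an assumption.
ABBREV = {"partial": "P", "mostly": "M", "full": "F"}


def render_chip(other_id, inbound_extent, outbound_extent):
    """Build the chip label shown on this entry's page.

    inbound_extent:  the other side does the verb to this entry (left arrow).
    outbound_extent: this entry does the verb to the other side (right arrow).
    A direction rated "none" is left out (assumed behaviour).
    """
    parts = [other_id]
    if inbound_extent in ABBREV:
        parts.append("←" + ABBREV[inbound_extent])
    if outbound_extent in ABBREV:
        parts.append("→" + ABBREV[outbound_extent])
    return " ".join(parts)


print(render_chip("AC-3", "full", "partial"))
```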

What the verb change actually moved

We re-authored every CWE↔ATT&CK and CWE↔CAPEC pair under the new Class-2 prompts, then promoted the results to authority: "manual_QA_v2" alongside the preserved-but-archived v1 rows. The sweep covered 1,320 unique pairs (343 CWE↔ATT&CK + 977 CWE↔CAPEC), 2,640 directed rows. Compared to the v1 ratings under covers:

30% same (396 pairs): the verb change preserved the rating in both directions.
54% shifted (707 pairs): neither direction crossed the none boundary, but at least one changed level (for example, partial to mostly).
16% flipped (217 pairs): one or both directions crossed the none boundary.
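The three buckets can be expressed as a comparison of the v1 and v2 extent pairs. The sketch below reflects one reading of the definitions (a pair is "flipped" if either direction moved to or from none, "shifted" if ratings changed without doing so); the function name classify is hypothetical. The percentages are recomputed from the counts given above.

```python
def classify(v1, v2):
    """Compare two (forward_extent, reverse_extent) tuples for one pair."""
    if v1 == v2:
        return "same"
    # A direction crosses the none boundary when exactly one of the two
    # ratings for that direction is "none".
    crossed = any((old == "none") != (new == "none") for old, new in zip(v1, v2))
    return "flipped" if crossed else "shifted"


counts = {"same": 396, "shifted": 707, "flipped": 217}
total = sum(counts.values())
shares = {bucket: round(100 * n / total) for bucket, n in counts.items()}

print(classify(("partial", "partial"), ("none", "partial")))
print(total, shares)
```

The CWE-200 pattern discussed next is the first case printed: partial / partial under covers becoming none / partial under enables and exploits lands in the flipped bucket.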

The flips were the most informative. Headline pattern: CWE-200 (Information Exposure) versus ATT&CK reconnaissance techniques — T1016, T1046, T1049, T1083, T1135, T1595, T1111. Under covers, every pair rated partial / partial: thematic adjacency was being read as “CWE-200 covers T1083 partially, T1083 covers CWE-200 partially.” Under enables, the forward direction (does the weakness enable the technique?) drops to none for every reconnaissance technique in the cluster: T1083 (File and Directory Discovery) succeeds via standard OS utilities regardless of whether an info-exposure flaw exists; T1135 (Network Share Discovery) succeeds via protocol enumeration; T1595 (Active Scanning) gathers infrastructure details without needing CWE-200. The verb forced the harder question: is the weakness necessary for the technique to land? Often the honest answer is no. The reverse direction stays partial — exploits asks how much of the weakness’s surface the technique uses, and the answer is “a slice.”

Headline pattern on the other pair: CWE→CAPEC shifted up. Under covers, most CWE→CAPEC rows read partial because the narrow CAPEC was seen as capturing only a slice of the broad CWE’s surface. Under enables, the question becomes “is the weakness necessary for the attack pattern?” — and for a specific CAPEC variant the answer is usually full. CAPEC-209 (XSS via MIME-Type Mismatch) reads “CWE-79 enables CAPEC-209 = full” (eliminate XSS, the MIME trick has nowhere to land) and “CAPEC-209 exploits CWE-79 = partial” (one narrow variant, dozens of others exist).

The ATLAS purity audit

Before pulling MITRE ATLAS into the new schema we audited it for AI specificity. ATLAS’s 170 techniques cover an AI-focused threat space, but a non-trivial share of the entries are essentially ATT&CK techniques with the AML-T identifier stamped on. The audit (data/atlas_purity_audit.csv) scores each technique on AI-keyword density (in title + description), generic-IT keyword density, and whether the description cites an ATT&CK parent technique. The split:

107 keep (63%) — clearly AI-specific.
31 scope-narrow (18%) — some AI content but cites ATT&CK parents; map only the AI-specific slice, defer the rest to ATT&CK.
32 drop (19%) — no real AI content; the entry is a rebranded general-IT attack.

Examples of the drops: AML.T0049 “Exploit Public-Facing Application” is verbatim ATT&CK T1190 with the AML-T id stamped on; AML.T0050 “Command and Scripting Interpreter” is verbatim T1059; AML.T0072 “Reverse Shell” describes generic reverse shell behaviour; AML.T0048.* (External Harms / Financial / Reputational / Societal / User) are outcome categories rather than attack techniques. None of these will get ATLAS cross-walk rows; the ATT&CK row already covers them. The keep list goes through Class-1 authoring against ATT&CK using subsumes / subsumed_by (Threat↔Threat) so the AI-specific framing stays distinguishable.
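The keep / scope-narrow / drop decision could look roughly like the sketch below. Everything specific in it is an assumption: the keyword list, the threshold, and the function names are invented for illustration, and the generic-IT density score the audit also records is left out for brevity. Only the three outcomes and their stated conditions come from the audit description.

```python
import re

# Hypothetical keyword list and threshold; the audit's real values are not
# given on this page.
AI_KEYWORDS = {"model", "training", "inference", "prompt", "dataset"}
MIN_AI_DENSITY = 0.05


def keyword_density(text, keywords):
    """Fraction of words in text (title + description) that are keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in keywords for word in words) / len(words)


def triage(text, cites_attack_parent):
    """Return "keep", "scope-narrow", or "drop" for one technique."""
    if keyword_density(text, AI_KEYWORDS) < MIN_AI_DENSITY:
        return "drop"  # no real AI content
    if cites_attack_parent:
        return "scope-narrow"  # some AI content, but defers to an ATT&CK parent
    return "keep"  # clearly AI-specific


print(triage("Poison the training dataset of a model", False))
print(triage("Open a reverse shell on the host", False))
print(triage("Craft a prompt to extract the model weights", True))
```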

Honest caveats

These cross-walks are interpretive guidance, not legal or compliance certifications. The LLM is fast and consistent but not infallible; the human reviewer is slower but still capable of error. Treat chips as starting points for analysis, not as substitutes for it. For high-stakes decisions, read the underlying control / category text yourself.

Only two pair-types have moved to Class-2 so far. CWE↔ATT&CK and CWE↔CAPEC carry authority: "manual_QA_v2" with the direction-natural verbs. Other cross-category pairs — CWE↔STIG, ASVS↔CWE, CSF↔CWE, NIST↔OWASP-Web — still carry the legacy covers verb until a future re-author pass touches them. The v1 rows for those pairs are honest about scope (the coverage rating reflects the “covers” framing) but readers should know the verb dictionary will expand to them eventually.

A “full” rating in the broad→narrow direction is conditional. When a pair reads “broad covers narrow = full” and the reverse is partial or none, the chip is signalling the logical-containment case: a narrow item that sits inside a broader one. That relationship is useful, but it’s a dangerous one to read uncritically. The broad standard typically subsumes many specific requirements similar to the narrow one, and the full rating assumes the broad standard’s implementation actually reaches every detailed design or code-level vulnerability it could in principle address. In practice, that reach is hard to guarantee. A team meaningfully invested in a broad outcome can still ship instances of any individual narrow flaw because no broad program can exhaustively chase every specific weakness without a dedicated mechanism. Read full in the broad→narrow direction as “this should be covered when the broader work reaches the relevant level of detail,” not as “this is automatically handled.”

The earlier convention conflict, for the record. Before the two-class taxonomy, the “covers” verb itself was stored under two conventions: some author scripts stored (source=A, target=B, coverage=X) as “B covers A by X”; others, as “A covers B by X.” The LLM was internally consistent within each author run, but the convention varied between authors because the JSON field names (x_to_y) read as math arrows while the question wording asked the opposite. That was settled 2026-05-25: a site-wide commit to the intuitive math reading, a swap of affected rows, and a rename of JSON fields to a_covers_b. The two-class taxonomy of 2026-06-03 sits on top of that foundation; covers still works the same way, it’s just no longer the universal verb.

Three further limitations:

Authoritative is not the same as bidirectional. The official pairs are published one-way; the reverse-direction view is our own two-way authoring, not the publishing authority’s. Don’t read inbound authoritative edges as a coverage score, and treat the Authoritative only toggle as the canonical record of what the authority actually said.
Coverage gap on edges we haven’t authored yet. The LLM authoring pass runs incrementally. A blank chip area doesn’t always mean “no relationship” — sometimes it means “we haven’t gotten to that pair yet.” The framework page’s publication date is the cut-off.
Asymmetric framework abstraction levels. CSF Govern subcategories (“establish risk strategy”) rarely map to specific CWEs or STIG rules. That’s expected: they’re policy outcomes, not implementation details. Absence is correct, not a gap.

How we keep this current

Framework releases (annual for OWASP, quarterly for STIGs, irregular for CSF/ASVS, ad hoc for ATLAS) trigger a re-ingest. For the third-party authoritative pairs that re-ingest is deterministic; the schema migration applied 2026-06-04 means the new rows land in the same nested source/target shape with the relation verb attached, so they show up in the chip renderer alongside everything else. For the LLM-authored pairs, new rows go through the same author-then-QA pipeline. When the underlying control text changes we either re-run the LLM for the affected rows or accept the previous mapping with a footnote.

TL;DR

A cross-walk row carries a verb (relation) and an extent (none / partial / mostly / full). Both directions are stored separately.
Same-category pairs use one verb both ways (Control↔Control = covers; Threat↔Threat = subsumes; etc.).
Cross-category pairs use two verbs — one per direction (CWE→ATT&CK is enables/exploits; Control→Threat is mitigates/is_mitigated_by; CVE→Threat is is_susceptible_to/exploits).
The 2026-06 re-author of CWE↔ATT&CK and CWE↔CAPEC changed the rating on 70% of pairs (924 of 1,320) under the new verbs. Most of that came from rejecting thematic-association partials and lifting narrow-CAPEC-of-broad-CWE pairs to full on the enables direction.
ATLAS’s 170 techniques split 107/31/32 on AI-specificity; the 32 ATT&CK re-skins won’t get their own ATLAS cross-walks.
The official cross-walks are one-directional; only the published direction carries the authority. For CSF↔800-53 and OWASP-Web↔CWE we author our own two-way mapping, default to it, and offer an Authoritative only toggle — with a ≠ badge where ours conflicts with or adds to the authority’s forward edges.
Absence of a chip still means “no meaningful relationship,” not “mapping incomplete” — unless the framework pair is on the backlog.

Last updated 07 June 2026. Canonical taxonomy doc: docs/xwalk-taxonomy.md. Live counts in controls_xwalks: 11,775 total rows — 4,483 manual_QA + 2,640 manual_QA_v2 + 4,652 DISA_CCI; relation distribution: covers 9,135, enables 1,320, exploits 1,320.