Reading cross-walks: how the framework chips work
Vulnerability frameworks talk past each other. OWASP describes risk categories; NIST 800-53 describes controls; CSF describes outcomes; CWE describes weaknesses; ATT&CK describes adversary techniques; DISA STIGs describe per-OS configuration rules. They’re useful for different audiences — but the moment you need to answer “does this NIST control address OWASP A01?” or “which weaknesses does T1190 exploit?” you need a cross-walk. This page explains where the chips you see on framework pages come from, how to read them, and what changed in the most recent overhaul of the verb dictionary behind them.
What a cross-walk is
A cross-walk row says “entry A in framework X relates to entry B in framework Y”, with a verb that names the kind of relationship and an extent qualifier: partial, mostly, or full. We don’t write rows for absence: on a pair we have already authored, a missing chip means the LLM and the reviewer both found no meaningful connection.
Some mappings are authoritative
A few framework pairs come with official cross-walks that we ingest verbatim:
- NIST CSF 2.0 → 800-53 r5. NIST publishes the mapping in the Cybersecurity & Privacy Reference Tool. 738 directed edges across the 106 CSF subcategories.
- DISA STIG → NIST 800-53. Each STIG rule references one or more CCIs (Common Control Identifiers). DISA publishes the CCI→800-53 mapping; we compose rule→CCI→800-53 deterministically. 4,652 edges, no LLM.
- CWE ↔ CAPEC and CWE ↔ ATT&CK candidate sets. MITRE publishes the Related_Attack_Patterns bridge in the CWE catalog; we walk it to seed the candidate set our authoring pass rates.
- OWASP Top 10 for Web 2025 → CWE. Each OWASP category is a CWE meta-category in MITRE’s catalog; membership is published by MITRE.
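The rule→CCI→800-53 composition described for the STIG pair amounts to a join over two lookup tables. The sketch below shows one way to do it; the function name is hypothetical and the identifiers are placeholders invented for illustration, not real STIG, CCI, or mapping data.

```python
from collections import defaultdict

def compose_rule_to_controls(rule_to_ccis, cci_to_controls):
    """Compose rule->CCI and CCI->800-53 lookups into rule->800-53 edges."""
    edges = defaultdict(set)
    for rule, ccis in rule_to_ccis.items():
        for cci in ccis:
            # A CCI absent from the published mapping contributes no edge.
            for control in cci_to_controls.get(cci, ()):
                edges[rule].add(control)
    # Sort so the result does not depend on input ordering.
    return {rule: sorted(controls) for rule, controls in edges.items()}

# Placeholder identifiers, for illustration only.
rule_to_ccis = {"RULE-A": ["CCI-X", "CCI-Y"], "RULE-B": ["CCI-Y"]}
cci_to_controls = {"CCI-X": ["AC-3"], "CCI-Y": ["AC-3", "AU-2"]}
stig_edges = compose_rule_to_controls(rule_to_ccis, cci_to_controls)
```

Because the join uses sets and sorted output, two rules that reach the same control through different CCIs still yield one edge each, which is what makes the re-ingest repeatable.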
These show up as chips with no “reviewed by us” caveat — they’re as reliable as the publishing authority in the direction it publishes. And that direction is the catch: every one of them is one-way (note the arrows above). The reverse direction is a different matter — see the next section.
One-way authority, two-way reality
The authoritative mappings only ever assert the forward edge. NIST says CSF → 800-53; MITRE says OWASP-Web category → member CWEs. Neither publishes the reverse, yet a control page needs to answer “which outcomes point at me?” and a weakness page needs “which categories cover me?” Inverting a one-way mapping to fake that reverse is misleading: the set of subcategories that happen to cite a control is a by-product of the forward mapping, not a curated statement that the control supports those outcomes.
So for these pairs we now author our own two-way mapping — an LLM rates both directions with an extent, and the rows go through the same QA queue as every other interpretive pair. On a control page (CSF ↔ 800-53) and a CWE page (OWASP-Web ↔ CWE) you get a small toggle:
- Our mapping (default) — our two-way reading, including edges the authority never listed.
- Authoritative only — just the official one-way edges, nothing we added.
Because we rate every authoritative edge as well as our own, we can
show where the two disagree in the direction the authority
actually publishes. A chip is badged ≠
when it is either a conflict (the authority maps it
but our reviewer found no meaningful link) or an
addition (we map it but the authority doesn’t).
Pure differences of degree — we say mostly where the
authority simply says “mapped” — are not flagged,
because the authority carries no extent to disagree with. The point
isn’t that the authority is wrong; it’s that
“authoritative” and “bidirectional” are not the
same thing, and the badge keeps the two honest.
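The badge rule above reduces to a comparison of two facts per forward edge: whether the authority publishes it, and whether our rating is anything other than none. A minimal sketch, assuming a hypothetical `badge` helper and the extent labels used on this page:

```python
def badge(authority_maps, our_extent):
    """Classify one forward-direction edge for the "≠" badge.

    authority_maps: True if the publishing authority lists the edge.
    our_extent: our rating, one of "none", "partial", "mostly", "full".
    Returns "conflict", "addition", or None (no badge).
    """
    we_map = our_extent != "none"
    if authority_maps and not we_map:
        return "conflict"  # authority maps it, our reviewer found no link
    if we_map and not authority_maps:
        return "addition"  # we map it, the authority does not
    return None  # agreement, or only a difference of degree
```

Note that the extent level never enters the comparison, which is why a mostly against a bare “mapped” is not flagged.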
Each two-way mapping ships as a release with its own reliability and abstraction measures, a diff against the authoritative original (where one exists), and downloadable raw data (JSON / CSV / XLSX). See all cross-walk releases.
The rest needs interpretation
Most useful cross-walks don’t exist as official data: OWASP↔NIST, ASVS↔CWE, ASVS↔NIST, CSF↔CWE, CWE↔STIG, CWE↔ATT&CK extents. The relationships are real and load-bearing for practitioners, but no organisation publishes them. We fill the gap with a two-pass process:
- LLM authoring. For each source entry, we narrow the target framework to the top 8–10 most-likely candidates by family seed + text overlap (and, for CWE↔ATT&CK, by the MITRE-published bridge), then ask a frontier model to evaluate each pair in both directions and return an extent rating + a one-sentence rationale.
- Human QA. Every LLM-authored row sits in a review queue under authority: "llm_unverified" until a person promotes it to manual_QA. Confident rows (mostly/full in both directions) can be batch-promoted; the rest get per-pair review.
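The QA split can be sketched as a filter over the review queue. The row shape here is an assumption made for illustration: the `forward`, `reverse`, and `pair` keys are hypothetical, and only the `authority` values come from the description above.

```python
CONFIDENT = {"mostly", "full"}

def split_review_queue(rows):
    """Split unverified rows into batch-promotable and per-pair review lists."""
    batch, per_pair = [], []
    for row in rows:
        if row["authority"] != "llm_unverified":
            continue  # already promoted, nothing left to review
        confident = row["forward"] in CONFIDENT and row["reverse"] in CONFIDENT
        (batch if confident else per_pair).append(row)
    return batch, per_pair

def promote(row):
    """Return a copy of the row marked as human-reviewed."""
    return {**row, "authority": "manual_QA"}

# Hypothetical queue contents.
queue = [
    {"pair": "P1", "forward": "full", "reverse": "mostly", "authority": "llm_unverified"},
    {"pair": "P2", "forward": "partial", "reverse": "full", "authority": "llm_unverified"},
    {"pair": "P3", "forward": "full", "reverse": "full", "authority": "manual_QA"},
]
batch, per_pair = split_review_queue(queue)
```

Returning a copy from `promote` keeps the original queue entry intact, so a rejected promotion leaves no trace on the unverified row.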
Why “covers” stopped being enough
For most of the project, the canonical verb was a single word — “covers”. The convention read: “A covers B by X” means “if you implement A, X of B is addressed.” That works cleanly when both endpoints are controls: a narrow ASVS check covers a broad NIST control partially; the broad NIST control covers the narrow ASVS check fully. Asymmetry comes from breadth-of-scope.
It breaks the moment the two endpoints aren’t the same kind of thing. Try saying out loud: “T1190 covers CVE-2026-1234.” It doesn’t parse. The right sentence is “T1190 exploits CVE-2026-1234,” or from the other direction “CVE-2026-1234 is susceptible to T1190.” The verb is direction-natural; one word doesn’t do both sides.
The author-prompt history showed the same friction. CWE↔ATT&CK rows were getting authored under a prompt that asked the LLM to rate “covers” in both directions, and the model had to twist the verb into “the technique exploits one slice of the weakness’s exploitation surface, so attack covers cwe = partial.” That worked, but it leaned on a particular reading of “covers” that drifted between author runs.
Two classes of mapping
We rewrote the convention into two classes (settled 2026-06-03; canonical doc at docs/xwalk-taxonomy.md in the repo):
- Class 1: same-category, shared verb. Both endpoints are the same kind. One verb works both ways; asymmetry comes from breadth-of-scope. Examples: Control↔Control uses covers; Threat↔Threat uses subsumes/subsumed_by; Vuln-class↔Vuln-class uses generalizes/specializes.
- Class 2: cross-category, two distinct verbs. The two endpoints are different kinds. Each direction has its own direction-natural verb and its own independently-rated extent. The two extents can disagree, because the questions are different. CWE→ATT&CK uses enables in the forward direction (“weakness enables technique”) and exploits in the reverse (“technique exploits weakness”). Control→Threat uses mitigates/is_mitigated_by. CVE→Threat uses is_susceptible_to/exploits. CWE→Asset uses manifests_in/is_prone_to.
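One way to picture the two classes is a lookup keyed by the kinds of the two endpoints. The table below contains only the pairs named in the list above; the kind labels (`vuln_class` for CWE, `threat` for ATT&CK techniques) and the function names are assumptions for illustration, not the contents of docs/xwalk-taxonomy.md.

```python
# (source kind, target kind) -> (forward verb, reverse verb)
VERBS = {
    ("control", "control"): ("covers", "covers"),
    ("threat", "threat"): ("subsumes", "subsumed_by"),
    ("vuln_class", "vuln_class"): ("generalizes", "specializes"),
    ("vuln_class", "threat"): ("enables", "exploits"),
    ("control", "threat"): ("mitigates", "is_mitigated_by"),
    ("cve", "threat"): ("is_susceptible_to", "exploits"),
    ("vuln_class", "asset"): ("manifests_in", "is_prone_to"),
}

def mapping_class(source_kind, target_kind):
    """Class 1 when both endpoints are the same kind, otherwise Class 2."""
    return 1 if source_kind == target_kind else 2

def verbs_for(source_kind, target_kind):
    """Look up the (forward, reverse) verbs for a pair of endpoint kinds."""
    return VERBS[(source_kind, target_kind)]
```

For example, `verbs_for("vuln_class", "threat")` gives the enables/exploits pair used for CWE→ATT&CK.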
The full verb dictionary has 14 entries. Every row in
controls_xwalks now carries a
relation field naming the verb the rating was
authored under, plus an extent field with the
none / partial / mostly / full rating.
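As a rough picture of what one directed row carries, here is a flattened sketch. Only `relation`, `extent`, and `authority` are field names taken from this page; the class name, the flat `source`/`target` strings, and the validation are assumptions, and the stored shape in controls_xwalks may differ. The example values are the CWE-79 and CAPEC-209 ratings quoted later on this page.

```python
from dataclasses import dataclass

EXTENTS = ("none", "partial", "mostly", "full")

@dataclass(frozen=True)
class XwalkRow:
    source: str     # entry the verb is read from
    target: str     # entry the verb is applied to
    relation: str   # verb the rating was authored under
    extent: str     # one of EXTENTS
    authority: str  # provenance label, e.g. "manual_QA_v2"

    def __post_init__(self):
        # Reject ratings outside the four-level scale.
        if self.extent not in EXTENTS:
            raise ValueError(f"unknown extent: {self.extent!r}")

# Each direction is its own row with its own verb and extent.
forward = XwalkRow("CWE-79", "CAPEC-209", "enables", "full", "manual_QA_v2")
reverse = XwalkRow("CAPEC-209", "CWE-79", "exploits", "partial", "manual_QA_v2")
```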
A chip on a category page renders like this:
AC-3 ←F →P
The arrow marks read: ← the other side does the verb to this (full); → this does the verb to the other side (partial). The verb itself appears in the tooltip on hover, e.g. “AC-3 mitigates T1078 (partial); T1078 is mitigated by AC-3 (full).”
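The chip text can be derived from the two extents. This is a sketch of the idea rather than the site's renderer: the function name is hypothetical, the letter M for mostly is an assumption (only F and P appear in the example above), and so is the choice to omit an arrow whose extent is none.

```python
LETTER = {"partial": "P", "mostly": "M", "full": "F"}

def render_chip(label, inbound, outbound):
    """Build chip text from two extents.

    inbound:  extent of the other side's verb applied to this entry.
    outbound: extent of this entry's verb applied to the other side.
    """
    marks = []
    if inbound != "none":
        marks.append("←" + LETTER[inbound])
    if outbound != "none":
        marks.append("→" + LETTER[outbound])
    return " ".join([label] + marks)

chip = render_chip("AC-3", inbound="full", outbound="partial")
```

With the values from the example above, `chip` is the string `AC-3 ←F →P`.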
What the verb change actually moved
We re-authored every CWE↔ATT&CK and CWE↔CAPEC
pair under the new Class-2 prompts, then promoted the results to
authority: "manual_QA_v2" alongside the
preserved-but-archived v1 rows. The sweep covered 1,320 unique
pairs (343 CWE↔ATT&CK + 977 CWE↔CAPEC), 2,640
directed rows. Compared to the v1 ratings under
covers:
- 30% same (396 pairs): the verb change preserved the rating in both directions.
- 54% shifted (707 pairs): at least one direction changed level, but neither crossed the none boundary.
- 16% flipped (217 pairs): one or both directions crossed the none boundary.
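The three buckets can be expressed as a small classifier over the old and new (forward, reverse) extents. The function is a sketch under one assumption: that “shifted” means any change that stays on the same side of the none boundary. The counts are the ones reported above.

```python
def classify(v1, v2):
    """Compare (forward, reverse) extents before and after the verb change."""
    if v1 == v2:
        return "same"
    # A flip is any direction that moved into or out of "none".
    crossed = any((old == "none") != (new == "none") for old, new in zip(v1, v2))
    return "flipped" if crossed else "shifted"

# Counts reported for the sweep; shares are rounded to whole percents.
counts = {"same": 396, "shifted": 707, "flipped": 217}
total = sum(counts.values())
shares = {name: round(100 * n / total) for name, n in counts.items()}
```

A pair that goes from partial / partial to none / partial, as the CWE-200 cluster does in the forward direction, classifies as flipped; partial / partial to full / partial classifies as shifted.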
The flips were the most informative. Headline pattern:
CWE-200 (Information Exposure) versus ATT&CK
reconnaissance techniques — T1016, T1046, T1049,
T1083, T1135, T1595, T1111. Under covers, every
pair rated partial / partial: thematic adjacency
was being read as “CWE-200 covers T1083 partially,
T1083 covers CWE-200 partially.” Under
enables, the forward direction (does the weakness
enable the technique?) drops to none for every
reconnaissance technique in the cluster: T1083 (File and
Directory Discovery) succeeds via standard OS utilities
regardless of whether an info-exposure flaw exists; T1135
(Network Share Discovery) succeeds via protocol enumeration;
T1595 (Active Scanning) gathers infrastructure details without
needing CWE-200. The verb forced the harder question: is the
weakness necessary for the technique to land? Often the
honest answer is no. The reverse direction stays
partial — exploits asks how much
of the weakness’s surface the technique uses, and the
answer is “a slice.”
Headline pattern on the other pair: CWE→CAPEC
shifted up. Under covers, most CWE→CAPEC
rows read partial because the narrow CAPEC was
seen as capturing only a slice of the broad CWE’s surface.
Under enables, the question becomes “is the
weakness necessary for the attack pattern?” — and
for a specific CAPEC variant the answer is usually
full. CAPEC-209 (XSS via MIME-Type Mismatch) reads
“CWE-79 enables CAPEC-209 = full” (eliminate XSS,
the MIME trick has nowhere to land) and “CAPEC-209
exploits CWE-79 = partial” (one narrow variant, dozens of
others exist).
The ATLAS purity audit
Before pulling MITRE ATLAS into the new schema we audited it
for AI specificity. ATLAS’s 170 techniques cover an
AI-focused threat space, but a non-trivial share of the entries
are essentially ATT&CK techniques with the AML-T identifier
stamped on. The audit
(data/atlas_purity_audit.csv) scores each technique
on AI-keyword density (in title + description), generic-IT
keyword density, and whether the description cites an ATT&CK
parent technique. The split:
- 107 keep (63%) — clearly AI-specific.
- 31 scope-narrow (18%) — some AI content but cites ATT&CK parents; map only the AI-specific slice, defer the rest to ATT&CK.
- 32 drop (19%) — no real AI content; the entry is a rebranded general-IT attack.
Examples of the drops: AML.T0049 “Exploit Public-Facing
Application” is verbatim ATT&CK T1190 with the
AML-T id stamped on; AML.T0050 “Command and Scripting
Interpreter” is verbatim T1059; AML.T0072 “Reverse
Shell” describes generic reverse shell behaviour;
AML.T0048.* (External Harms / Financial / Reputational /
Societal / User) are outcome categories rather than attack
techniques. None of these will get ATLAS cross-walk rows; the
ATT&CK row already covers them. The
keep list goes through Class-1 authoring
against ATT&CK using subsumes /
subsumed_by (Threat↔Threat) so the AI-specific
framing stays distinguishable.
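To make the three audit signals concrete, here is a toy version of the scoring. Everything specific in it is hypothetical: the keyword sets, the `ai_floor` threshold, the decision order, and the sample texts are invented for illustration and are not the rules behind data/atlas_purity_audit.csv.

```python
import re

# Hypothetical keyword lists; the real audit's vocabulary is not shown here.
AI_TERMS = {"model", "training", "inference", "dataset", "prompt"}
IT_TERMS = {"shell", "server", "network", "credential", "interpreter"}

def density(text, terms):
    """Share of words in the text that belong to the given term set."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(w in terms for w in words) / len(words)

def triage(title, description, cites_attack_parent, ai_floor=0.05):
    """Assign keep / scope-narrow / drop from the three audit signals."""
    text = f"{title} {description}"
    ai = density(text, AI_TERMS)
    generic = density(text, IT_TERMS)
    if ai < ai_floor:
        return "drop"  # no real AI content
    if cites_attack_parent or generic > ai:
        return "scope-narrow"  # some AI content, but leans on general IT
    return "keep"  # clearly AI-specific
```

A description with no AI vocabulary lands in drop regardless of the other two signals, which mirrors how the re-skinned entries were treated.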
Honest caveats
These cross-walks are interpretive guidance, not legal or compliance certifications. The LLM is fast and consistent but not infallible; the human reviewer is slower but still capable of error. Treat chips as starting points for analysis, not as substitutes for it. For high-stakes decisions, read the underlying control / category text yourself.
Only two pair-types have moved to Class-2 so
far. CWE↔ATT&CK and CWE↔CAPEC carry
authority: "manual_QA_v2" with the
direction-natural verbs. Other cross-category pairs —
CWE↔STIG, ASVS↔CWE, CSF↔CWE,
NIST↔OWASP-Web — still carry the legacy
covers verb until a future re-author pass touches
them. The v1 rows for those pairs are honest about scope (the
coverage rating reflects the “covers” framing) but
readers should know the verb dictionary will expand to them
eventually.
A “full” rating in the broad→narrow
direction is conditional. When a pair reads “broad
covers narrow = full” and the reverse is partial or none,
the chip is signalling the logical-containment case: a narrow
item that sits inside a broader one. That relationship is
useful, but it’s a dangerous one to read uncritically. The
broad standard typically subsumes many specific requirements
similar to the narrow one, and the full rating
assumes the broad standard’s implementation actually
reaches every detailed design or code-level vulnerability it
could in principle address. In practice, that reach is hard to
guarantee. A team meaningfully invested in a broad outcome can
still ship instances of any individual narrow flaw because no
broad program can exhaustively chase every specific weakness
without a dedicated mechanism. Read full in the
broad→narrow direction as “this should be covered
when the broader work reaches the relevant level of
detail,” not as “this is automatically handled.”
The earlier convention conflict, for the record.
Before the two-class taxonomy, the “covers” verb itself
had a convention conflict: some author scripts stored
(source=A, target=B, coverage=X) as “B covers A
by X”; others, as “A covers B by X.” The LLM was
internally consistent within each author run, but the convention
varied between authors because the JSON field names
(x_to_y) read as math arrows while the question
wording asked the opposite. That was settled on 2026-05-25 with a
site-wide commit to the intuitive math reading, a swap of the affected
rows, and a rename of the JSON fields to a_covers_b. The
two-class taxonomy of 2026-06-03 sits on top of that foundation:
covers still works the same way; it’s
just no longer the universal verb.
Three further limitations:
- Authoritative is not the same as bidirectional. The official pairs are published one-way; the reverse-direction view is our own two-way authoring, not the publishing authority’s. Don’t read inbound authoritative edges as a coverage score, and treat the Authoritative only toggle as the canonical record of what the authority actually said.
- Coverage gap on edges we haven’t authored yet. The LLM authoring pass runs incrementally. A blank chip area doesn’t always mean “no relationship” — sometimes it means “we haven’t gotten to that pair yet.” The framework page’s publication date is the cut-off.
- Asymmetric framework abstraction levels. CSF Govern subcategories (“establish risk strategy”) rarely map to specific CWEs or STIG rules. That’s expected: they’re policy outcomes, not implementation details. Absence is correct, not a gap.
How we keep this current
Framework releases (annual for OWASP, quarterly for STIGs,
irregular for CSF/ASVS, ad hoc for ATLAS) trigger a re-ingest.
For the third-party authoritative pairs that re-ingest is
deterministic; the schema migration applied 2026-06-04 means
the new rows land in the same nested
source/target shape with the
relation verb attached, so they show up in the chip
renderer alongside everything else. For the LLM-authored pairs,
new rows go through the same author-then-QA pipeline. When the
underlying control text changes we either re-run the LLM for the
affected rows or accept the previous mapping with a footnote.
TL;DR
- A cross-walk row carries a verb (relation) and an extent (none / partial / mostly / full). Both directions are stored separately.
- Same-category pairs use one verb both ways (Control↔Control = covers; Threat↔Threat = subsumes; etc.).
- Cross-category pairs use two verbs, one per direction (CWE→ATT&CK is enables/exploits; Control→Threat is mitigates/is_mitigated_by; CVE→Threat is is_susceptible_to/exploits).
- The 2026-06 re-author of CWE↔ATT&CK and CWE↔CAPEC moved 70% of ratings under the new verbs. Most of that came from rejecting thematic-association partials and lifting narrow-CAPEC-of-broad-CWE pairs to full on the enables direction.
- ATLAS’s 170 techniques split 107/31/32 on AI-specificity; the 32 ATT&CK re-skins won’t get their own ATLAS cross-walks.
- The official cross-walks are one-directional; only the published direction carries the authority. For CSF↔800-53 and OWASP-Web↔CWE we author our own two-way mapping, default to it, and offer an Authoritative only toggle, with a ≠ badge where ours conflicts with or adds to the authority’s forward edges.
- Absence of a chip still means “no meaningful relationship,” not “mapping incomplete”, unless the framework pair is on the backlog.
Last updated 07 June 2026. Canonical taxonomy doc:
docs/xwalk-taxonomy.md. Live counts in
controls_xwalks: 11,775 total rows —
4,483 manual_QA + 2,640 manual_QA_v2 + 4,652 DISA_CCI;
relation distribution: covers 9,135,
enables 1,320, exploits 1,320.