Cyber Resilience

CVE-2023-25558

High · RCE

Published: 11 February 2023

Modified: 21 November 2024
KEV Added
Patch
CVSS Score v3.1: 7.5 (CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H)
EPSS Score: 0.0423 (89.0th percentile)
Risk Priority: 18 (60% EPSS · 20% KEV · 20% CVSS)

Summary

CVE-2023-25558 is a high-severity Deserialization of Untrusted Data (CWE-502) vulnerability in Datahub Project Datahub. Its CVSS base score is 7.5 (High).
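The base score of 7.5 can be re-derived from the vector string using the CVSS v3.1 base-score equations. The sketch below is a minimal illustration that only handles vectors with Scope: Unchanged, which is the case here; the function and variable names are illustrative and not part of any official tooling.

```python
import math

# CVSS v3.1 metric weights. The PR values are the Scope: Unchanged ones.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, following the CVSS v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a Scope: Unchanged CVSS v3.1 vector."""
    # Drop the leading "CVSS:3.1" element, then map metric -> value.
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    if parts["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {metric: WEIGHTS[metric][parts[metric]] for metric in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score("CVSS:3.1/AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H")
print(score)
```

The impact sub-score is about 5.87 and the exploitability sub-score about 1.62; their sum, rounded up to one decimal, gives the published 7.5. The high attack complexity (AC:H) is what keeps the score below the 8.8 an otherwise identical AC:L vector would receive.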

Operationally, this CVE sits at the 89.0th EPSS percentile, which places it in the top 11.0% of CVEs by exploit likelihood; it is not currently listed in the CISA KEV catalog.
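The page does not state how the Risk Priority of 18 is calculated, only the weights (60% EPSS, 20% KEV, 20% CVSS). One reading that reproduces the figure is a weighted sum on a 0 to 100 scale, using the raw EPSS probability, a 0/1 KEV flag, and the CVSS score divided by 10. The sketch below is an assumption about the formula, not the site's documented method.

```python
# Values taken from the page header.
epss_probability = 0.0423
epss_percentile = 89.0
cvss_base = 7.5
in_kev = False


def risk_priority(epss, cvss, kev):
    """Assumed formula: weighted sum of normalized inputs, scaled to 0-100."""
    kev_flag = 1.0 if kev else 0.0
    return 100 * (0.60 * epss + 0.20 * kev_flag + 0.20 * cvss / 10.0)


priority = risk_priority(epss_probability, cvss_base, in_kev)

# A percentile of 89.0 means 11.0% of CVEs score at or above this one.
top_share = 100.0 - epss_percentile

print(round(priority), top_share)
```

Under this reading, nearly all of the priority comes from the CVSS term (15 points), with EPSS contributing about 2.5 points and KEV none.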

Deeper analysis

DataHub, an open-source metadata platform, contains a deserialization flaw in its frontend when SSO authentication is enabled via the pac4j library. The library processes id_token claims without sufficient validation; any claim whose value begins with the {#sb64} prefix is treated as a serialized Java object and deserialized. Although a RestrictedObjectInputStream restricts the allowed classes, the permitted packages remain broad enough to permit gadget chains that can result in remote code execution. The issue is tracked as CWE-502 and was assigned a CVSS 3.1 score of 7.5.
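As a defensive illustration of the trigger condition described above, the following sketch decodes the payload of an `id_token` and flags any top-level string claim that starts with the `{#sb64}` prefix. It is not pac4j's or DataHub's code; the helper names and the demo token are hypothetical, and the payload is decoded without signature verification purely for inspection.

```python
import base64
import json

SERIALIZED_PREFIX = "{#sb64}"  # prefix named in the vulnerability description


def decode_payload(id_token):
    """Decode the JWT payload segment WITHOUT verifying the signature."""
    payload_b64 = id_token.split(".")[1]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def suspicious_claims(claims):
    """Return the names of top-level string claims that carry the prefix."""
    return sorted(
        name
        for name, value in claims.items()
        if isinstance(value, str) and value.startswith(SERIALIZED_PREFIX)
    )


def _b64url(data):
    """Helper for the demo only: JSON-encode and base64url-encode without padding."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical unsigned token built only for this demonstration.
demo_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"sub": "alice", "nickname": "{#sb64}rO0ABXNy"}),
    "",
])

flagged = suspicious_claims(decode_payload(demo_token))
print(flagged)
```

A check like this only inspects top-level claims and is no substitute for upgrading; it is meant to show how little an attacker-controlled claim needs to contain to reach the deserialization path.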

An attacker with a low-privileged account on an affected DataHub instance can supply a crafted id_token containing a malicious serialized payload during SSO login. Successful exploitation grants arbitrary code execution on the frontend server, potentially allowing full compromise of the metadata platform. The attack requires network access and involves high complexity due to the need to identify a usable gadget chain within the restricted class set.

The project’s security advisory and associated patches direct users to upgrade to a fixed release; the remediation commit is available in the DataHub repository. No workarounds are documented. The EPSS score reached a peak of 0.0796 before receding to its current value of 0.0423, indicating modest post-disclosure interest that has since declined.

EU & UK References

Vulnerability details

DataHub is an open-source metadata platform. When the DataHub frontend is configured to authenticate via SSO, it uses the pac4j library. The `id_token` is processed in an unsafe manner that the DataHub frontend does not properly account for. Specifically, if any `id_token` claim value starts with the `{#sb64}` prefix, pac4j treats the value as a serialized Java object and deserializes it. In the worst case, this can lead to Remote Code Execution (RCE). Although a `RestrictedObjectInputStream` is in place and restricts which classes can be deserialized, it still allows a broad range of Java packages and is potentially exploitable with different gadget chains. Users are advised to upgrade. There are no known workarounds. This vulnerability was discovered and reported by the GitHub Security Lab and is tracked as GHSL-2022-086.
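The weakness of a package-level allow-list can be illustrated by analogy in Python, whose `pickle` module lets an unpickler restrict classes through `find_class`. The sketch below is not the Java `RestrictedObjectInputStream`; it is a swapped-in analogue showing that a prefix-based rule admits every class in an approved package, while an exact-name rule admits only what was intended. The class names are hypothetical.

```python
import io
import pickle
from collections import Counter


class PrefixRestrictedUnpickler(pickle.Unpickler):
    """Allows any class whose module name starts with an approved prefix."""

    ALLOWED_PREFIXES = ("collections",)

    def find_class(self, module, name):
        if module.startswith(self.ALLOWED_PREFIXES):
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked: {module}.{name}")


class ExactRestrictedUnpickler(pickle.Unpickler):
    """Allows only explicitly listed (module, class) pairs."""

    ALLOWED = {("collections", "OrderedDict")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked: {module}.{name}")


def load(unpickler_cls, payload):
    """Deserialize payload with the given restricted unpickler class."""
    return unpickler_cls(io.BytesIO(payload)).load()


# The application only ever expected OrderedDict, but the sender chose Counter.
payload = pickle.dumps(Counter({"a": 2}))

prefix_result = load(PrefixRestrictedUnpickler, payload)

try:
    load(ExactRestrictedUnpickler, payload)
    exact_blocked = False
except pickle.UnpicklingError:
    exact_blocked = True

print(prefix_result, exact_blocked)
```

`Counter` is harmless, but the same reasoning applies to any class in an approved package that has exploitable side effects during deserialization, which is how gadget chains survive a broad allow-list.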

CWE(s)

CWE-502: Deserialization of Untrusted Data

Related Threats

No named actor attribution yet. ATT&CK technique mapping in progress for this CVE.

Affected Assets

Vendor: datahub project
Product: datahub
Affected versions: ≤ 0.9.5
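A deployed version can be checked against the range listed above with a numeric comparison. The sketch below takes the "≤ 0.9.5" bound exactly as shown on this page and assumes plain dotted numeric versions with an optional leading "v"; pre-release suffixes are not handled, and the function names are illustrative.

```python
def parse_version(text):
    """Turn '0.9.5' or 'v0.9.5' into a tuple of ints for numeric comparison."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Upper bound of the affected range as listed on this page.
LAST_AFFECTED = parse_version("0.9.5")


def is_affected(version):
    """True if the version falls inside the listed affected range."""
    return parse_version(version) <= LAST_AFFECTED


results = {v: is_affected(v) for v in ("0.8.45", "0.9.5", "0.10.0")}
print(results)
```

Comparing tuples of integers avoids the string-comparison trap in which "0.10.0" would sort before "0.9.5".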

Mitigating Controls

Likely Mitigating Controls (AI)

Per-CVE control mapping for this CVE has not run yet; the list below is derived from the weakness types (CWEs) cited in the NVD entry.

addresses: CWE-502

Penetration testing supplies malicious serialized objects, detecting unsafe deserialization and supporting corrective actions.

addresses: CWE-502

Evaluation of untrusted data handling (deserialization testing) reveals unsafe processing, which the required remediation process addresses.

addresses: CWE-502

Untrusted serialized data can be deserialized and observed inside an isolated detonation chamber, so that any gadget-chain exploitation stays contained within the sandbox.

addresses: CWE-502

Validates or rejects untrusted serialized data before deserialization occurs.

addresses: CWE-502

Identifies and blocks malicious code introduced through deserialization of untrusted data at system boundaries.

addresses: CWE-502

Integrity verification of serialized information can detect tampering before deserialization occurs.

addresses: CWE-502

Provenance of associated data allows detection of untrusted sources before deserialization or processing occurs.
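The integrity-verification control listed above can be sketched with an HMAC tag that is checked before any parsing takes place. This is a generic illustration, not DataHub or pac4j code: the key, the function names, and the use of JSON as the serialization format are all assumptions made for the example.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key for the example


def seal(obj):
    """Serialize obj to JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps(obj, sort_keys=True).encode()
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode()
    return tag + b"." + body


def open_sealed(blob):
    """Verify the tag first; deserialize only if it matches."""
    tag, _, body = blob.partition(b".")
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed; refusing to deserialize")
    return json.loads(body)


sealed = seal({"user": "alice"})
restored = open_sealed(sealed)

# Simulate tampering with the serialized body after it was sealed.
tampered = sealed.replace(b"alice", b"admin")
try:
    open_sealed(tampered)
    tamper_detected = False
except ValueError:
    tamper_detected = True

print(restored, tamper_detected)
```

The ordering is the point: the tag is verified over the raw bytes before the deserializer ever sees them, so modified input is rejected without being parsed.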

References