CVE-2023-25558
Published: 11 February 2023
Summary
CVE-2023-25558 is a high-severity Deserialization of Untrusted Data (CWE-502) vulnerability in DataHub (vendor: DataHub Project). Its CVSS base score is 7.5 (High).
Operationally, it ranks in the top 11.0% of CVEs by exploit likelihood and is not currently listed in the CISA KEV catalog.
Deeper analysis
DataHub, an open-source metadata platform, contains a deserialization flaw in its frontend when SSO authentication is enabled via the pac4j library. The library processes id_token claims without sufficient validation; any claim whose value begins with the {#sb64} prefix is treated as a serialized Java object and deserialized. Although a RestrictedObjectInputStream restricts the allowed classes, the permitted packages remain broad enough to permit gadget chains that can result in remote code execution. The issue is tracked as CWE-502 and was assigned a CVSS 3.1 score of 7.5.
An attacker with a low-privileged account on an affected DataHub instance can supply a crafted id_token containing a malicious serialized payload during SSO login. Successful exploitation grants arbitrary code execution on the frontend server, potentially allowing full compromise of the metadata platform. The attack requires network access and involves high complexity due to the need to identify a usable gadget chain within the restricted class set.
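The prefix check described above can be turned into a small triage aid for captured tokens or logs. The following is a minimal Python sketch, not part of DataHub or pac4j. The function name `suspicious_claims` is hypothetical, the token signature is deliberately not verified, only top-level string claims are inspected, and the check assumes the text after `{#sb64}` is Base64 of a standard Java serialization stream (which begins with the magic bytes `0xACED`).

```python
import base64
import json

SB64_PREFIX = "{#sb64}"          # prefix named in the advisory text
JAVA_STREAM_MAGIC = b"\xac\xed"  # magic bytes of a Java serialization stream


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def suspicious_claims(id_token: str) -> dict:
    """Map each flagged claim name to True if its payload looks like a Java stream.

    Hypothetical helper for triage only: the signature is NOT verified and
    only top-level string claims are examined.
    """
    parts = id_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a compact JWT with three segments")
    claims = json.loads(_b64url_decode(parts[1]))

    findings = {}
    for name, value in claims.items():
        if isinstance(value, str) and value.startswith(SB64_PREFIX):
            body = value[len(SB64_PREFIX):]
            try:
                raw = base64.b64decode(body + "=" * (-len(body) % 4))
            except ValueError:  # binascii.Error is a ValueError subclass
                raw = b""
            findings[name] = raw.startswith(JAVA_STREAM_MAGIC)
    return findings
```

A flagged claim is only an indicator worth investigating, not proof of exploitation, and a clean result does not show that an instance is safe.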
The project’s security advisory and associated patches direct users to upgrade to a fixed release; the remediation commit is available in the DataHub repository. No workarounds are documented. The EPSS score reached a peak of 0.0796 before receding to its current value of 0.0423, indicating modest post-disclosure interest that has since declined.
EU & UK References
- 🇪🇺 ENISA EUVD: EUVD-2023-29510
Vulnerability details
DataHub is an open-source metadata platform. When the DataHub frontend is configured to authenticate via SSO, it leverages the pac4j library. The `id_token` is processed in an unsafe manner that the DataHub frontend does not properly account for. Specifically, if any `id_token` claim value starts with the `{#sb64}` prefix, pac4j considers the value to be a serialized Java object and deserializes it. In the worst case, this issue may lead to Remote Code Execution (RCE). Although a `RestrictedObjectInputStream` is in place and puts some restrictions on which classes can be deserialized, it still allows a broad range of Java packages and is potentially exploitable with different gadget chains. Users are advised to upgrade. There are no known workarounds. This vulnerability was discovered and reported by the GitHub Security Lab and is tracked as GHSL-2022-086.
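The point that a class restriction can still be too broad is easier to see with a small analogy. The sketch below is a Python analog built on `pickle.Unpickler.find_class`; it is not pac4j's `RestrictedObjectInputStream`, and the names `AllowListUnpickler`, `broad_policy`, and `narrow_policy` are hypothetical. It contrasts a package-level allow-list with a class-level one, using the harmless built-in `len` as a stand-in for an unintended class.

```python
import io
import pickle
from collections import OrderedDict


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that resolves only globals approved by the `allowed` predicate."""

    def __init__(self, data: bytes, allowed):
        super().__init__(io.BytesIO(data))
        self._allowed = allowed

    def find_class(self, module, name):
        # Called for every global the stream asks to resolve.
        if not self._allowed(module, name):
            raise pickle.UnpicklingError(f"blocked: {module}.{name}")
        return super().find_class(module, name)


def broad_policy(module, name):
    # Package-level rule: anything inside these top-level modules is accepted.
    return module.split(".")[0] in {"collections", "builtins"}


def narrow_policy(module, name):
    # Class-level rule: only the exact types the application expects.
    return (module, name) in {("collections", "OrderedDict")}


def load(data: bytes, allowed):
    return AllowListUnpickler(data, allowed).load()


expected = pickle.dumps(OrderedDict(a=1))
unexpected = pickle.dumps(len)  # harmless stand-in for an unintended global

print(load(expected, narrow_policy))    # the expected type loads under the narrow rule
print(load(unexpected, broad_policy))   # the broad rule also resolves builtins.len
try:
    load(unexpected, narrow_policy)
except pickle.UnpicklingError as exc:
    print(exc)                          # the narrow rule rejects it
```

The broad rule accepts every name in an allowed package, including ones the application never intended to deserialize, which is the kind of room a gadget chain needs. An exact class list leaves far less of it.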
- CWE(s): CWE-502
Related Threats
No named actor attribution yet. ATT&CK technique mapping in progress for this CVE.
Affected Assets
Mitigating Controls
Likely Mitigating Controls (AI)
The per-CVE control mapping has not been run yet; the list below is derived from the weakness types (CWEs) cited in the NVD entry.
- Penetration testing supplies malicious serialized objects, detecting unsafe deserialization and supporting corrective actions.
- Evaluation of untrusted data handling (deserialization testing) reveals unsafe processing, which the required remediation process addresses.
- Untrusted serialized data can be deserialized and observed inside an isolated detonation chamber (sandbox), which contains any gadget-chain execution instead of letting it reach production systems.
- Validates or rejects untrusted serialized data before deserialization occurs.
- Identifies and blocks malicious code introduced through deserialization of untrusted data at system boundaries.
- Integrity verification of serialized information can detect tampering before deserialization occurs.
- Provenance of associated data allows detection of untrusted sources before deserialization or processing occurs.
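The integrity-verification control listed above can be sketched in a few lines. This is a generic Python illustration under stated assumptions, not DataHub's or pac4j's code: the helper names `sign` and `verify_then_parse` are hypothetical, the shared key is assumed to be provisioned securely elsewhere, and JSON stands in for the serialized format so that parsing cannot instantiate arbitrary classes.

```python
import hashlib
import hmac
import json


def sign(payload: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the serialized payload."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_then_parse(payload: bytes, tag: bytes, key: bytes):
    """Parse the payload only after its integrity tag has been verified."""
    expected = sign(payload, key)
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(expected, tag):
        raise ValueError("integrity check failed; payload not parsed")
    return json.loads(payload)


key = b"example-key-for-illustration-only"  # assumption: real keys come from a secret store
payload = b'{"user": "alice", "roles": ["viewer"]}'
tag = sign(payload, key)

print(verify_then_parse(payload, tag, key))
```

The ordering is the design choice that matters: the tag is checked before any parsing happens, so a tampered payload never reaches the deserializer.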