CVE-2023-47248
Published: 09 November 2023
Summary
CVE-2023-47248 is a critical-severity Deserialization of Untrusted Data (CWE-502) vulnerability in Apache PyArrow. Its CVSS base score is 9.8 (Critical).
Operationally, it ranks in the top 0.6% of CVEs by exploit likelihood; it is not currently listed in the CISA Known Exploited Vulnerabilities (KEV) catalog.
Deeper analysis
CVE-2023-47248 is a deserialization of untrusted data flaw (CWE-502) in the IPC and Parquet readers of PyArrow versions 0.14.0 through 14.0.0. The issue is triggered when an application reads Arrow IPC, Feather, or Parquet files from untrusted sources such as user-supplied input, and it is limited to the Python PyArrow implementation rather than other Apache Arrow language bindings.
An unauthenticated remote attacker can supply a crafted data file to any consuming application or library, resulting in arbitrary code execution on the target system. The vulnerability carries a CVSS 3.1 base score of 9.8, reflecting network attack vector, low complexity, and full impact on confidentiality, integrity, and availability.
Official guidance from the Apache Arrow project directs users to upgrade to PyArrow 14.0.1, which is already published on PyPI; downstream projects are advised to raise their dependency requirement to PyArrow 14.0.1 or later. When an immediate upgrade is not feasible, the separate `pyarrow-hotfix` package can be installed to disable the vulnerable deserialization path on older versions.
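As a rough illustration of the affected range stated above (0.14.0 through 14.0.0, fixed in 14.0.1), the following sketch checks a version string against that range. The helper names `parse_version`, `is_affected`, and `installed_pyarrow_is_affected` are hypothetical, and the parsing is deliberately simple: it reads only the leading numeric components and ignores pre-release or local suffixes, so treat it as a triage aid rather than a definitive scanner.

```python
import re
from importlib import metadata

# Range taken from the advisory text: 0.14.0 through 14.0.0 inclusive.
FIRST_AFFECTED = (0, 14, 0)
LAST_AFFECTED = (14, 0, 0)


def parse_version(version: str) -> tuple:
    """Return the leading numeric release components, padded to three.

    Assumption: suffixes such as ".dev0" or "+local" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def is_affected(version: str) -> bool:
    """True if the version falls inside the advisory's affected range."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED


def installed_pyarrow_is_affected():
    """Check the locally installed pyarrow, or return None if it is absent."""
    try:
        return is_affected(metadata.version("pyarrow"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("installed pyarrow affected:", installed_pyarrow_is_affected())
```

A result of `True` means the installed release should be upgraded or covered by the hotfix; `None` means PyArrow is not installed in the current environment.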
The CVE exhibits a high exploitation probability, with EPSS values reaching a peak of 0.8801 and remaining near 0.8482, indicating sustained attacker interest after disclosure.
EU & UK References
- 🇪🇺 ENISA EUVD: EUVD-2023-0212
Vulnerability details
Deserialization of untrusted data in IPC and Parquet readers in PyArrow versions 0.14.0 to 14.0.0 allows arbitrary code execution. An application is vulnerable if it reads Arrow IPC, Feather or Parquet data from untrusted sources (for example user-supplied input files).
This vulnerability only affects PyArrow, not other Apache Arrow implementations or bindings. It is recommended that users of PyArrow upgrade to 14.0.1. Similarly, it is recommended that downstream libraries upgrade their dependency requirements to PyArrow 14.0.1 or later. PyPI packages are already available, and we hope that conda-forge packages will be available soon. If it is not possible to upgrade, we provide a separate package `pyarrow-hotfix` that disables the vulnerability on older PyArrow versions. See https://pypi.org/project/pyarrow-hotfix/ for instructions.
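For environments that cannot upgrade, a minimal sketch of enabling the hotfix at application start-up might look like the following. It assumes, per the package's PyPI instructions referenced above, that the installed module is named `pyarrow_hotfix` and that importing it is what applies the patch; the wrapper function `enable_pyarrow_hotfix` is a hypothetical name introduced here for illustration.

```python
import importlib


def enable_pyarrow_hotfix() -> bool:
    """Try to activate pyarrow-hotfix and report whether it was importable.

    Assumption: importing the `pyarrow_hotfix` module is sufficient to
    apply the patch, as described on the package's PyPI page.
    """
    try:
        importlib.import_module("pyarrow_hotfix")
    except ImportError:
        # The hotfix package is not installed in this environment.
        return False
    return True


if __name__ == "__main__":
    if not enable_pyarrow_hotfix():
        print("pyarrow-hotfix not installed; upgrade PyArrow to 14.0.1 or later")
```

Calling this before any code path that reads Arrow IPC, Feather, or Parquet data keeps the check in one place; an application could also refuse to process untrusted files when the function returns `False` and the installed PyArrow is older than 14.0.1.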
- CWE(s): CWE-502
Related Threats
No named threat actor has been attributed yet. ATT&CK technique mapping is in progress for this CVE.
Affected Assets
Mitigating Controls
Likely Mitigating Controls AI
Per-CVE control mapping for this CVE has not run yet; the list below is derived from the weakness types (CWEs) cited in the NVD entry.
Penetration testing supplies malicious serialized objects, detecting unsafe deserialization and supporting corrective actions.
Evaluation of untrusted data handling (deserialization testing) reveals unsafe processing, which the required remediation process addresses.
Untrusted serialized data can be deserialized and observed inside a detonation chamber (an isolated sandbox), which confines any gadget-chain exploitation to the sandbox.
Validates or rejects untrusted serialized data before deserialization occurs.
Identifies and blocks malicious code introduced through deserialization of untrusted data at system boundaries.
Integrity verification of serialized information can detect tampering before deserialization occurs.
Provenance of associated data allows detection of untrusted sources before deserialization or processing occurs.
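The integrity-verification control listed above can be sketched as a digest check performed before a file is handed to any reader. This is an illustrative example only: the function names `sha256_of_file` and `file_matches_digest` are hypothetical, and it assumes a trusted producer has published the expected SHA-256 digest through a separate channel. A matching digest shows the file is the one the producer published; it does not make a file from an untrusted producer safe to deserialize.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks and return its SHA-256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches_digest(path: str, expected_sha256: str) -> bool:
    """Compare the file's digest with the expected value in constant time."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A caller would invoke `file_matches_digest(path, expected)` first and pass the path to a Parquet, Feather, or Arrow IPC reader only when it returns `True`.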