CVE-2025-49837
Published: 15 July 2025
Summary
CVE-2025-49837 is a high-severity Deserialization of Untrusted Data (CWE-502) vulnerability in RVC-Boss GPT-SoVITS-WebUI. Its CVSS base score is 8.9 (High).
Operationally, exploitation aligns with the MITRE ATT&CK technique Exploit Public-Facing Application (T1190). The CVE ranks in the top 27.1% of CVEs by exploit likelihood, is not currently listed in the CISA KEV catalog, and has a public proof-of-concept referenced.
This vulnerability is AI-related: it is categorised as LLM Application Platforms, in the Supply Chain and Deployment risk domain.
The strongest mitigations our analysis identified are NIST 800-53 SI-10 (Information Input Validation) and SI-2 (Flaw Remediation).
Deeper analysis
CVE-2025-49837 is an unsafe deserialization vulnerability (CWE-502) affecting GPT-SoVITS-WebUI, an open-source voice conversion and text-to-speech web interface. The issue resides in the AudioPre class of the vr.py module. The model_choose parameter accepts unsanitized user input representing a model path. This input is passed to the uvr function, which appends a .pth extension and instantiates AudioPre with the resulting path; AudioPre then loads that file with torch.load. Versions up to and including 20250228v3 are vulnerable, with a CVSS v3.1 base score of 9.8 (AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H).
A remote, unauthenticated attacker can exploit this vulnerability over the network with low complexity and no user interaction by supplying a malicious model path. The torch.load deserialization of a crafted .pth file enables arbitrary code execution on the server, potentially leading to full system compromise with high impacts on confidentiality, integrity, and availability.
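The reason a crafted .pth file is dangerous is that pickle-based loading resolves and calls whatever globals the file names. As a minimal sketch of that mechanism from the defensive side, the following uses only the Python standard library (not PyTorch) to show an unpickler that refuses any global outside an allowlist. The names `ALLOWED_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical and are not part of GPT-SoVITS.

```python
import io
import pickle
from collections import OrderedDict

# Hypothetical allowlist for this sketch: the only globals a payload may reference.
ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on the allowlist."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize bytes while rejecting references to unexpected callables."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# A benign payload built only from allowlisted types loads normally.
benign = pickle.dumps(OrderedDict(weight=[0.1, 0.2]))
print(restricted_loads(benign))

# A payload that references any other global is rejected before it can be used.
# The builtin `len` stands in for an attacker-chosen callable.
suspicious = pickle.dumps(len)
try:
    restricted_loads(suspicious)
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

An unrestricted load would have resolved the second payload's global without complaint, which is the behaviour an attacker relies on when the file path is under their control.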
The GitHub Security Lab advisory (GHSL-2025-049_GHSL-2025-053) details the flaw with code references in vr.py and webui.py but confirms no patched versions were available at publication on 2025-07-15. Mitigation requires avoiding untrusted model paths and validating/sanitizing inputs before torch.load; users should monitor the GPT-SoVITS repository for fixes.
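One way to apply the input-validation advice, sketched here under assumptions, is to treat the user-supplied value as a model name rather than a path and accept it only if it matches a file already present in a trusted weights directory. The function `resolve_model_path`, the directory layout, and the file name `example_model.pth` are hypothetical; this is not the project's actual fix.

```python
import tempfile
from pathlib import Path


def resolve_model_path(weights_dir: Path, model_name: str) -> Path:
    """Map a user-supplied model name to a file inside weights_dir, or raise ValueError."""
    base = weights_dir.resolve()
    # Only names of .pth files that already exist in the trusted directory are accepted.
    allowed = {p.stem for p in base.glob("*.pth") if p.is_file()}
    if model_name not in allowed:
        raise ValueError(f"unknown model: {model_name!r}")
    candidate = (base / f"{model_name}.pth").resolve()
    # Defence in depth: a symlink could still point outside the directory.
    if candidate.parent != base:
        raise ValueError("model path escapes the weights directory")
    return candidate


# Demonstration with a throwaway directory standing in for the weights folder.
with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp)
    (weights / "example_model.pth").write_bytes(b"")
    print(resolve_model_path(weights, "example_model").name)
    for bad in ("../outside", "/tmp/evil", "missing"):
        try:
            resolve_model_path(weights, bad)
        except ValueError as exc:
            print("rejected:", exc)
```

Path validation limits which file is loaded but does not make the load itself safe. PyTorch's `torch.load` also accepts `weights_only=True`, which restricts unpickling to tensors and basic types; whether the GPT-SoVITS checkpoints load under that setting is not stated in the advisory.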
This vulnerability is notable in AI/ML contexts, as GPT-SoVITS leverages PyTorch for model handling in voice synthesis pipelines, highlighting deserialization risks in ML web UIs. No public exploitation in the wild is reported.
EU & UK References
- 🇪🇺 ENISA EUVD: EUVD-2025-21561
Vulnerability details
GPT-SoVITS-WebUI is a voice conversion and text-to-speech webUI. In versions 20250228v3 and prior, there is an unsafe deserialization vulnerability in vr.py AudioPre. The model_choose variable takes user input (e.g. a path to a model) and passes it to the uvr function. In uvr, a new instance of AudioPre class is created with the model_path attribute containing the aforementioned user input (here called locally model_name). Note that in this step the .pth extension is added to the path. In the AudioPre class, the user input, here called model_path, is used to load the model on that path with torch.load, which can lead to unsafe deserialization. At time of publication, no known patched versions are available.
- CWE(s): CWE-502 (Deserialization of Untrusted Data)
AI Security Analysis
- AI Category: LLM Application Platforms
- Risk Domain: Supply Chain and Deployment
- OWASP Top 10 for LLMs 2025: None mapped
- Classification Reason: Matched keywords: gpt
Related Threats
MITRE ATT&CK Enterprise Techniques
Why these techniques?
The unsafe deserialization vulnerability (CWE-502) in GPT-SoVITS-WebUI's vr.py AudioPre uses torch.load on a user-controlled model path, enabling remote code execution by exploiting the public-facing web application.
Mitigating Controls
Mitigating Controls (NIST 800-53 r5)
Validates and sanitizes the user-supplied model_choose input to prevent path traversal or arbitrary paths leading to unsafe torch.load deserialization.
Requires timely remediation of the unsafe deserialization flaw in vr.py by applying patches, mitigations, or code fixes when available from the repository.
Mandates integrity checks such as hashes or signatures on .pth model files before torch.load to block tampered or malicious deserialized content.
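The integrity-check control can be sketched as a digest comparison that runs before any model file is handed to a loader. This is a minimal illustration assuming a locally maintained table of trusted SHA-256 digests; `sha256_of`, `verify_checkpoint`, and the file name `example_model.pth` are hypothetical and do not come from the GPT-SoVITS codebase.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path: Path, trusted_digests: dict[str, str]) -> bool:
    """True only if the file name is known and its digest matches the trusted value."""
    expected = trusted_digests.get(path.name)
    if expected is None:
        return False  # unknown files are never loaded
    return hmac.compare_digest(sha256_of(path), expected)


# Demonstration with a throwaway file standing in for a model checkpoint.
with tempfile.TemporaryDirectory() as tmp:
    model = Path(tmp) / "example_model.pth"
    model.write_bytes(b"original weights")
    trusted = {model.name: sha256_of(model)}  # recorded when the file was vetted
    print("intact:", verify_checkpoint(model, trusted))

    model.write_bytes(b"tampered weights")
    print("after tampering:", verify_checkpoint(model, trusted))
```

A digest table only helps if it is stored and distributed separately from the model files themselves; otherwise whoever can replace a checkpoint can replace its recorded digest as well.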