NEWS Your Server Just "Displayed" Scanned Documents. How a Simple PDF Generator Can Give Away All of a Company's Secrets

ExcalibuR

Researchers have discovered thirteen critical bugs in PDF generation libraries used by thousands of websites.

Researchers from the PT Swarm team found thirteen vulnerabilities in popular libraries for generating PDF files. What could possibly go wrong with a simple HTML to PDF conversion? Quite a lot, it turns out.

Thousands of web services create PDF documents every day: invoices, contracts, reports. Developers treat this as a routine technical task, but in practice this is often where untrusted input crosses a trust boundary. The renderer parses HTML, loads external resources, processes fonts, SVGs, and images, and sometimes has access to the network and the server's file system. Dangerous behavior can be enabled by default, with no explicit setting or warning. That is enough to turn a seemingly harmless converter into a tool for attacking server infrastructure.


AI-generated scan of a fictional passport used to demonstrate confidential data leakage

The researchers analyzed seven popular libraries for PHP, JavaScript, and Java in detail: TCPDF, html2pdf, jsPDF, mPDF, snappy, dompdf, and OpenPDF. In addition to the thirteen vulnerabilities, the team documented seven cases of dangerous behavior that is intentional on the library's part and identified six potential configuration errors. The discovered issues include unauthorized access to server files, unsafe data deserialization, server-side request forgery (SSRF), and denial of service.

PDF generation is widely used in e-commerce, fintech, logistics, and SaaS solutions. Such services are often deployed within the security perimeter, close to confidential data, where network restrictions are more relaxed. This means that even a minor bug in the renderer can escalate into a serious incident: the leakage of documents, secrets, or internal URLs.

The story of TCPDF, one of the most popular PHP libraries for creating PDFs, is particularly telling. In version 6.8.0, researchers found a path traversal vulnerability in SVG handling that allowed access to arbitrary images on the server. An attacker could embed into the HTML a specially crafted SVG with an <image> tag whose xlink:href attribute contains a relative path like ../../../../../../tmp/user_files/user_1/private_image.png. The library did not properly validate the path and embedded the private image into the resulting PDF.


PDF file with private user images obtained via a path traversal vulnerability

The developer released a fix in version 6.8.1, adding a check for the ../ sequence in the path. However, experts quickly found a bypass—simply URL-encoding the characters (..%2f instead of ../) rendered the protection useless. The issue was that the library checked the path first and then decoded it using the urldecode function. A new patch had to be released in version 6.9.1 with an improved isRelativePath function that accounts for various encoding variants.
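
The check-then-decode ordering described above can be illustrated with a minimal Python sketch. This is not TCPDF's code: naive_is_safe and resolve_inside are hypothetical names, and the base directory is an assumption made for the example. The idea is to decode first, normalize second, and only then decide whether the path stays inside an allowed directory.

```python
import posixpath
from urllib.parse import unquote


def naive_is_safe(path: str) -> bool:
    """Flawed order: inspects the raw string, although decoding happens later."""
    return "../" not in path


def resolve_inside(base_dir: str, user_path: str, max_rounds: int = 5) -> str:
    """Decode, normalize, then check containment; raise ValueError on escape.

    base_dir is assumed to be an absolute POSIX path. The check is purely
    lexical, so symlinks inside base_dir are not covered by this sketch.
    """
    decoded = user_path
    # Decode repeatedly so that double encoding (%252f) cannot hide a slash.
    for _ in range(max_rounds):
        step = unquote(decoded)
        if step == decoded:
            break
        decoded = step
    else:
        raise ValueError("too many encoding layers")

    if "\x00" in decoded:
        raise ValueError("null byte in path")

    # Treat backslashes as separators before normalizing.
    decoded = decoded.replace("\\", "/")
    base = posixpath.normpath(base_dir)
    candidate = posixpath.normpath(posixpath.join(base, decoded))

    # The resolved path must still live under the base directory.
    if posixpath.commonpath([base, candidate]) != base:
        raise ValueError("path escapes base directory")
    return candidate
```

With this ordering, ..%2f..%2ftmp%2fprivate_image.png passes naive_is_safe but is rejected by resolve_inside, because the encoded separators are decoded before the containment check runs.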

An even more serious problem was discovered in the same TCPDF library—a deserialization vulnerability. The TCPDF class contains a __destruct magic method that deletes temporary files from the $imagekeys array when the object is destroyed. If an attacker could pass a serialized string into the unserialize function, they gained the ability to delete arbitrary files from the server—they simply needed to specify the desired path in the imagekeys field. The vulnerability was fixed in version 6.9.3 by adding a check for the substring _tcpdf in the filename to verify the file belonged to the library.
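
The idea behind the fix, deleting only files that demonstrably belong to the generator, can be sketched in Python as follows. The function names are hypothetical and this is not the library's implementation. The marker check mirrors the _tcpdf substring mentioned above; the additional directory check is an assumption of this sketch, not something the researchers attribute to TCPDF.

```python
import os


def is_own_temp_file(path: str, temp_dir: str, marker: str = "_tcpdf") -> bool:
    """True only for files located directly in temp_dir whose name has the marker."""
    real = os.path.realpath(path)          # resolves symlinks and ../ segments
    real_dir = os.path.realpath(temp_dir)
    in_temp_dir = os.path.dirname(real) == real_dir
    return in_temp_dir and marker in os.path.basename(real)


def cleanup(paths, temp_dir: str) -> list:
    """Delete only files that pass the ownership check; return what was removed.

    `paths` stands in for attacker-influenced object state such as the
    imagekeys array, so no entry is trusted without the check.
    """
    removed = []
    for path in paths:
        if is_own_temp_file(path, temp_dir) and os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

The design choice is to derive trust from where the file lives and how it is named, never from the fact that its path appeared in the object's fields.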

For the spipu/html2pdf library, which internally uses TCPDF, a similar vulnerability was found, but exploitable via Phar archives. Phar is a special packaging format for PHP applications, and certain functions like file_exists, when processing a path like phar://..., automatically deserialize the archive's metadata. An attacker could upload a Phar archive disguised as an image to the server and then, via a <cert> tag with a src="phar:///path/to/archive.png" attribute, trigger deserialization and file deletion. This vulnerability is relevant for PHP versions below 8.0 and was fixed in html2pdf version 5.3.1.

A series of SSRF vulnerabilities was also found in html2pdf. The library allowed requests to internal services via the <link> and <img> tags and the CSS property background: url(). The developer added a SecurityInterface with a checkValidPath method to filter resources, but researchers discovered that in some scenarios the request was sent before the check was called. The issues were fully resolved only in version 5.3.2.
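
A rough Python sketch of the kind of filter that has to run before any request is issued. It is not html2pdf's checkValidPath: check_resource_url, is_public_ip, and the scheme allowlist are assumptions made for illustration. Restricting schemes to http and https also rejects wrappers such as phar:// and file://.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only web resources are needed


def is_public_ip(ip: str) -> bool:
    """Reject loopback, private, link-local, and other non-routable addresses."""
    addr = ipaddress.ip_address(ip)
    mapped = getattr(addr, "ipv4_mapped", None)  # unwrap ::ffff:a.b.c.d
    if mapped is not None:
        addr = mapped
    return not (addr.is_private or addr.is_loopback or addr.is_link_local
                or addr.is_multicast or addr.is_reserved or addr.is_unspecified)


def check_resource_url(url: str, resolve=socket.getaddrinfo) -> str:
    """Return the URL if it may be fetched, otherwise raise ValueError."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {scheme!r}")
    host = parts.hostname
    if not host:
        raise ValueError("missing host")

    port = parts.port or (443 if scheme == "https" else 80)
    infos = resolve(host, port, proto=socket.IPPROTO_TCP)
    if not infos:
        raise ValueError("host did not resolve")
    # Every address the name resolves to must be public.
    for info in infos:
        ip = info[4][0]
        if not is_public_ip(ip):
            raise ValueError(f"{host} resolves to non-public address {ip}")
    return url
```

The resolve parameter exists only so the lookup can be replaced in tests. In a real deployment the fetch should also connect to the address that was validated, otherwise a second DNS lookup can return a different result.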

The jsPDF library for JavaScript was found vulnerable to ReDoS attacks—denial of service via regular expressions. In version 3.0.0, a specially crafted string with repeating charset=s sequences caused the application to process data for over 39 seconds, fully loading a single CPU core. The vulnerability was assigned CVE-2025-29907 and fixed in version 3.0.1.


100% CPU load when exploiting the ReDoS vulnerability in the jsPDF library

However, while analyzing the fix, experts found another problem: incorrect data type conversion caused the loop position variable to become negative (-1711276032), meaning the loop effectively never reached its exit condition. Processing the string data:/,aaaaaaa took over five minutes. This vulnerability was assigned CVE-2025-57810 and fixed in version 3.0.2.
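
One way to avoid both classes of problem is to parse the data: URI header with plain string operations and a length cap instead of a backtracking regular expression. The Python sketch below is an illustration of that approach, not the jsPDF fix; parse_data_uri and MAX_HEADER are hypothetical names and the cap value is arbitrary.

```python
MAX_HEADER = 256  # hypothetical cap on the part between "data:" and the comma


def parse_data_uri(uri: str):
    """Split a data: URI into (mime_type, params, is_base64, payload).

    Uses only partition/split, so the work grows linearly with the input
    and there is no index arithmetic that could go negative.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("missing comma separator")
    if len(header) > MAX_HEADER:
        raise ValueError("header too long")

    fields = header.split(";")
    mime_type = fields[0] or "text/plain"  # empty media type falls back to text/plain
    is_base64 = False
    params = {}
    for field in fields[1:]:
        if field == "base64":
            is_base64 = True
        elif "=" in field:
            key, _, value = field.partition("=")
            params[key] = value
    return mime_type, params, is_base64, payload
```

With this parser, an input such as data:/,aaaaaaa returns immediately, and a header stuffed with thousands of repeated charset= fields is rejected by the length cap before any per-field work is done.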

Some libraries do not officially consider the described behavior a vulnerability. The TCPDF developer stated that SSRF via the <img> tag is "outside the library's scope of responsibility." The authors of mPDF and OpenPDF explicitly state in their documentation that the library processes input data "as is," without validation, and that sanitization responsibility lies with the programmer. In practice, this means that with careless use, such tools can allow embedding local server files into PDFs or making requests to internal network services.

Configuration errors deserve special attention. In dompdf, enabling the isPhpEnabled option allows executing PHP code directly from HTML markup, which amounts to remote code execution for anyone who controls the HTML. Simply passing HTML like <script type="text/php">shell_exec("id");</script> to the library will execute the command on the server. The isRemoteEnabled option permits loading external resources and opens the door to SSRF attacks, while an incorrectly configured setChroot("/") removes all restrictions on reading local files.

In mPDF, the allowAnnotationFiles option allows attaching arbitrary server files to the PDF via the <annotation> tag, including /etc/passwd. The showWatermarkImage option allows using any local file as a watermark. In the snappy library (a wrapper for wkhtmltopdf), the enable-local-file-access flag grants access to the local file system.
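
The settings listed above lend themselves to an automated check in a deployment pipeline. Below is a small Python sketch of such an audit. The option names come from the article; everything else (the audit function, the dictionary layout, and the idea that the application's PDF settings are available as a plain dictionary) is an assumption for illustration.

```python
# Option names as listed above; each rule says when a value counts as risky.
RISKY_SETTINGS = {
    "dompdf": {
        "isPhpEnabled": lambda v: bool(v),       # PHP execution from HTML
        "isRemoteEnabled": lambda v: bool(v),    # remote resources, SSRF
        "chroot": lambda v: v == "/",            # no file read restriction
    },
    "mpdf": {
        "allowAnnotationFiles": lambda v: bool(v),  # attach server files
        "showWatermarkImage": lambda v: bool(v),    # local file as watermark
    },
    "snappy": {
        "enable-local-file-access": lambda v: bool(v),  # wkhtmltopdf flag
    },
}


def audit(library: str, config: dict) -> list:
    """Return the sorted names of options in `config` that are set riskily."""
    rules = RISKY_SETTINGS.get(library, {})
    return sorted(
        name
        for name, is_risky in rules.items()
        if name in config and is_risky(config[name])
    )
```

For example, audit("dompdf", {"isPhpEnabled": True, "isRemoteEnabled": False, "chroot": "/"}) flags chroot and isPhpEnabled. A flagged option is not always a bug, but it should be a deliberate decision rather than a leftover default.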

Developers are advised to promptly update the libraries they use, carefully study their documentation for dangerous settings, and always sanitize user input before generating PDFs. All of the vulnerabilities found were fixed by the library maintainers in 2025.
 