
The Hidden Danger in PDFs: How Misconfigurations Can Expose Sensitive Data?

Patryk Bogdan

Overview

A recent security audit revealed a critical vulnerability in the way an application used WeasyPrint to generate PDF invoices from user-provided data. Because input validation was insufficient, attackers could inject HTML tags that WeasyPrint then rendered in the generated PDF. This flaw makes it possible to extract sensitive files from the application’s infrastructure or to make the server query remote resources, posing significant security risks.

Vulnerability Breakdown

The application allows user input, such as names or surnames, to be embedded directly into PDFs via the WeasyPrint engine. However, the lack of proper sanitization permits attackers to inject HTML tags into the input. For instance:

  • Tags like <b> or <h3> enable attackers to manipulate the formatting of PDF content.
  • Tags like <link> (for example, <link rel="attachment">, which WeasyPrint supports) allow local or remote files to be embedded in the generated PDF as attachments.

Such crafted inputs exploit WeasyPrint’s default configuration, enabling unauthorized access to files on the server or external sources. This capability can be leveraged to extract sensitive system data and perform internal reconnaissance.
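
To illustrate why the missing sanitization matters, here is a minimal sketch that escapes user-supplied fields before they are placed into an HTML template. The template, the function name render_invoice_html, and the sample input are hypothetical; the audited application's code and the exact payload are not reproduced in this write-up.

```python
import html

# Hypothetical invoice template; the real application's template is not shown.
INVOICE_TEMPLATE = "<html><body><p>Customer: {name} {surname}</p></body></html>"


def render_invoice_html(name: str, surname: str) -> str:
    """Build invoice HTML with user-supplied fields escaped as plain text."""
    return INVOICE_TEMPLATE.format(
        name=html.escape(name, quote=True),
        surname=html.escape(surname, quote=True),
    )


# Illustrative input of the kind described above (assumed, not the audited payload).
malicious = '<link rel="attachment" href="file:///etc/passwd">'
safe_html = render_invoice_html("Jan", malicious)
```

With escaping applied, the injected tag reaches the PDF engine as literal text rather than markup, so it is printed on the invoice instead of being interpreted.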

Real-World Exploitation

During testing, auditors demonstrated the severity of this vulnerability. The retrieved files exposed:

  • Payment operator tokens
  • Credentials for SMS and email gateways
  • PostgreSQL database access credentials
  • Hosting system access credentials
  • JWT encryption keys

Additionally, the vulnerability allowed querying of internal infrastructure from the server processing the PDFs, highlighting its potential for lateral movement within the application environment.

Finding the Vulnerability

Now, let’s move on to the interesting part! To identify this vulnerability and provide proof of its existence, I sent the following request:

Screenshot of malicious payload

The following response demonstrates the server’s acceptance of the malicious payload and initiation of PDF generation:

Screenshot of server response

Then I downloaded the PDF file generated by the malicious payload:

Screenshot of PDF download

Below is the Python script (ex.py) used during the analysis to extract attachments from the PDF:

Screenshot of Python script
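
The script in the screenshot is not reproduced as text here. As a rough stand-in, an attachment extractor could look something like the sketch below. It assumes the third-party pypdf library and its PdfReader.attachments mapping; it is not the author's ex.py, and the helper names safe_filename and extract_attachments are hypothetical.

```python
from pathlib import Path, PurePosixPath


def safe_filename(name: str) -> str:
    """Reduce an attachment name taken from the PDF to a bare file name."""
    # Names come from the PDF itself, so drop any directory components.
    base = PurePosixPath(name.replace("\\", "/")).name
    return base if base not in ("", ".", "..") else "attachment.bin"


def extract_attachments(pdf_path: str, out_dir: str = ".") -> list[Path]:
    """Write every file embedded in the PDF to out_dir and return the paths."""
    from pypdf import PdfReader  # third-party dependency: pip install pypdf

    reader = PdfReader(pdf_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # reader.attachments maps each attachment name to a list of byte strings.
    for name, contents in reader.attachments.items():
        for index, data in enumerate(contents):
            target = out / f"{index}_{safe_filename(name)}"
            target.write_bytes(data)
            written.append(target)
    return written
```

Calling extract_attachments on the downloaded PDF would write each embedded file into the chosen directory. The file name is sanitized because it is taken from the PDF and should not be trusted as a path.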

Using the ex.py script, I extracted the attachment from the PDF document:

Screenshot of script execution

After running the script, the following file appears in the directory:

Screenshot of extracted file

The contents of the extracted file are displayed, revealing very sensitive environment variables:

Screenshot of environment variables

That is not all! I then downloaded a second PDF file generated from another malicious payload:

Screenshot of second PDF

As you can see, the contents of the extracted /etc/passwd file are displayed, confirming unauthorized file access:

Screenshot of /etc/passwd contents

Root Cause

This vulnerability stems from the default configuration of WeasyPrint, which allows unrestricted access to local and external files. Without stringent input validation and output sanitization, the software effectively serves as a bridge for unauthorized data extraction.

To address this vulnerability, organizations should:

  1. Implement Input Validation and Sanitization: User-generated data should be rigorously sanitized to strip out any HTML or script tags before being incorporated into documents.

  2. Restrict Resource Access: Limit the software’s access to local and external files, allowing only resources essential for its operation.

  3. Environment Hardening:
     a) Segregate sensitive configuration files across different machines.
     b) Adopt the principle of least privilege for processes involved in PDF generation.
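
One possible way to approach the second recommendation is a custom URL fetcher that rejects everything outside an allowlist. The sketch below assumes a WeasyPrint version that accepts a url_fetcher callable in HTML(...) and exposes default_url_fetcher; the host assets.example.com and the function names are hypothetical, and the exact fetcher API should be checked against the installed version.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: hosts that serve the invoice's own static assets.
ALLOWED_HOSTS = {"assets.example.com"}


def is_allowed_url(url: str) -> bool:
    """Allow only HTTPS URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URLs are rejected outright
    host = (parts.hostname or "").lower()
    return parts.scheme.lower() == "https" and host in ALLOWED_HOSTS


def restricted_url_fetcher(url: str, *args, **kwargs):
    """Fetcher that refuses file:// and any target that is not allowlisted."""
    if not is_allowed_url(url):
        raise ValueError(f"Blocked resource: {url!r}")
    from weasyprint import default_url_fetcher  # third-party dependency
    return default_url_fetcher(url, *args, **kwargs)


def render_pdf(html_string: str) -> bytes:
    """Render HTML to PDF bytes using the restricted fetcher."""
    from weasyprint import HTML  # third-party dependency
    return HTML(string=html_string, url_fetcher=restricted_url_fetcher).write_pdf()
```

The policy check is kept in a separate function so it can be unit tested without WeasyPrint installed. A fetcher like this complements input sanitization rather than replacing it: even if markup slips through, file:// paths and internal hosts are no longer reachable from the renderer.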
