DomPDF remote images blocked/broken due to missing user agent

Created on 26 August 2022, about 2 years ago
Updated 13 February 2023, almost 2 years ago

Problem/Motivation

When remote images are rendered within a PDF, DomPDF does not set a user agent string by default.

Our client has a Cloudflare rule that blocks incoming requests with no user agent, as a spam prevention measure.

As a result, requests are blocked when DomPDF tries to load the images, and they appear broken inside the generated PDF.

Steps to reproduce

Proposed resolution

  1. Create a new field in the DomPDF plugin settings form allowing a user agent string to be defined
  2. If an override is configured, set the user agent in DomPDF via setHttpContext

Patch/merge request to follow.
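
As a rough illustration of step 2 (not the actual patch), the sketch below builds a stream context carrying a user agent and passes it to DomPDF via setHttpContext. The helper name build_user_agent_context() and the example user agent string are hypothetical; in the module, the value would come from the new plugin setting. The DomPDF call is guarded so the snippet also runs where the library is not installed.

```php
<?php

/**
 * Builds an HTTP stream context that sends a User-Agent header.
 *
 * Hypothetical helper for illustration; not part of Entity Print or Dompdf.
 *
 * @param string|null $userAgent Configured override, or NULL/'' when unset.
 * @return resource|null Stream context, or NULL when no override is configured.
 */
function build_user_agent_context(?string $userAgent)
{
    $userAgent = trim((string) $userAgent);
    if ($userAgent === '') {
        // No override configured: leave Dompdf's default context untouched.
        return null;
    }
    // 'user_agent' is an option of PHP's http:// stream wrapper; the same
    // 'http' options are used for https:// URLs.
    return stream_context_create(['http' => ['user_agent' => $userAgent]]);
}

// Example value only (an assumption); the real value would be site config.
$context = build_user_agent_context('Mozilla/5.0 (compatible; ExamplePdfBot/1.0)');

// Only touch Dompdf when the library is available, so this runs standalone.
if ($context !== null && class_exists(\Dompdf\Dompdf::class)) {
    $dompdf = new \Dompdf\Dompdf();
    $dompdf->setHttpContext($context);
}
```

Returning NULL when the field is empty keeps the current behaviour for sites that do not configure an override.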

User interface changes

An additional field on the DomPDF plugin configuration page.

✨ Feature request
Status

Needs review

Version

2.6

Component

Code

Created by

🇬🇧 United Kingdom jamiep


Comments & Activities


  • 🇪🇸 Spain Sistemas SUMADOS

    Similar problem with OVH hosting, which requires a User-Agent header. All the files exist and are accessible via a browser, but DomPDF can't access them.

    Drupal LOG:

    Error generating document: 
    Failed to generate PDF: 
    
    file_get_contents(https://<DOMAIN>/sites/default/files/css/css_Swr_hRAtW0jSHWbyenAiqMecy20dpgniBv9v4K6Y6zY.css): Failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden , Unable to load css file https://<DOMAIN>/sites/default/files/css/css_Swr_hRAtW0jSHWbyenAiqMecy20dpgniBv9v4K6Y6zY.css,
    
    file_get_contents(https://<DOMAIN>/sites/default/files/pdf_imagenes/Cabecera_PDF_SC_SUMADOS.png): Failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden , Image not found /sites/default/files/pdf_imagenes/Cabecera_PDF_SC_SUMADOS.png,
    
    file_get_contents(https://<DOMAIN>/system/files/webform/<IMAGE_PATH>): Failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden , Image not found https://<DOMAIN>/system/files/webform/<IMAGE_PATH>

    Hosting LOG:

    CSS:

    [Wed Mar 08 16:23:37 2023] [error] [client <IP_ADDRES>] ModSecurity: Access denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "/usr/local/apache2/conf/modsecurity/base_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "65"] [id "960009"] [rev "2.1.1"] [msg "Request Missing a User Agent Header"] [severity "NOTICE"] [tag "PROTOCOL_VIOLATION/MISSING_HEADER_UA"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "<DOMAIN>"] [uri "/sites/default/files/css/css_Swr_hRAtW0jSHWbyenAiqMecy20dpgniBv9v4K6Y6zY.css"] [unique_id "ZAioeYUtHj9div7cW2O1CgAAAEo"]

    IMAGE:

    [Wed Mar 08 16:23:37 2023] [error] [client <IP_ADDRES>] ModSecurity: Access denied with code 403 (phase 2). Operator EQ matched 0 at REQUEST_HEADERS. [file "/usr/local/apache2/conf/modsecurity/base_rules/modsecurity_crs_21_protocol_anomalies.conf"] [line "65"] [id "960009"] [rev "2.1.1"] [msg "Request Missing a User Agent Header"] [severity "NOTICE"] [tag "PROTOCOL_VIOLATION/MISSING_HEADER_UA"] [tag "WASCTC/WASC-21"] [tag "OWASP_TOP_10/A7"] [tag "PCI/6.5.10"] [hostname "<DOMAIN>"] [uri "/sites/default/files/pdf_imagenes/Cabecera_PDF_SC_SUMADOS.png"] [unique_id "ZAioeYUtHj9div7cW2O1CwAAAFQ"]

  • Status changed to Needs work over 1 year ago
  • 🇦🇺 Australia larowlan 🇦🇺 .au GMT+10

    left a comment on the MR

  • 🇺🇸 United States andysipple

    The MR in #3 worked great for me.
    I did need to enter a user agent in the new field; see https://www.whatismybrowser.com/guides/the-latest-user-agent/chrome

  • 🇬🇧 United Kingdom jamiep

    Thanks for reviewing.

    I don't think the "[error] Session has not been set" message is related to the update hook, though, as the documentation for hook_post_update_NAME states that the return value can be string|null:

    string|null Optionally, hook_post_update_NAME() hooks may return a translated string that will be displayed to the user after the update has completed. If no message is returned, no message will be presented to the user.

    This issue appears to cover the same or a similar session error: https://www.drupal.org/project/entity_print/issues/3383187 🐛 Unexpected error with print engine PhpWkhtmlToPdf or DomPdf: Session has not been set (Needs review)
