[3.x] Check if Html::decodeEntities is required and secure

Created on 29 November 2023, over 1 year ago
Updated 16 July 2024, 9 months ago

Problem/Motivation

Currently within the WatchdogMailer.php file, this line (98) of code is used:

$body = Html::decodeEntities(strip_tags($body));

This might be prone to possible security risks, as it reverts previous HTML sanitation efforts, and makes its content unsafe (e.g. usage of the <script> tag is allowed again)

Steps to reproduce

Proposed resolution

Check whether this line can be replaced by a different (safer) approach and if yes, do so.

Remaining tasks

User interface changes

API changes

Data model changes

πŸ“Œ Task
Status

Closed: works as designed

Version

3.0

Component

Code

Created by

πŸ‡©πŸ‡ͺGermany lrwebks Porta Westfalica

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @lrwebks
  • Assigned to cameron prince
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Thanks for the report @LRwebks.

    The change has been introduced here: #3246512: Strip HTML in error messages received via email β†’ by @cameron prince.

    This indeed looks risky to me, could you please explain the changes made @cameron prince?

    I think this is a security risk, see
    https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Component%21Util...

    Be careful when using this function, as it will revert previous sanitization efforts (<script> will become

    ).
    Could you please explain why you are sure this is safe? Thank you! :)
  • πŸ‡ΊπŸ‡ΈUnited States cameron prince

    I think the issue description is pretty clear. The emails are sent as plain text, not HTML. This was added so that contents of the emails are more legible and don't include HTML code or encoded entities.

    As the content isn't being rendered as HTML, I don't really get the security concern. If there is a better or more appropriate way, I'm all ears.

  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    @cameron prince thanks for your super fast feedback!

    The question is, if it might happen, that the mail is sent as HTML using contrib mail systems like Symfony mailer or others?

    The point you made definitely mitigates the risk a lot. Now I'm unsure if there's risk left and on the other hand if there isn't a more safe way to fix this then, for example by using

    1. https://www.php.net/manual/en/function.strip-tags.php
    2. https://api.drupal.org/api/drupal/core%21lib%21Drupal%21Core%21Mail%21Ma...

    instead?

    https://api.drupal.org/api/drupal/core%21core.api.php/function/hook_mail already uses MailFormatHelper::htmlToText at the end for example.

    So removing these characters is fine, but I think the decoding (which is the risky part) might not be needed?

  • πŸ‡ΊπŸ‡ΈUnited States cameron prince

    Maybe the functions need to be reversed to alleviate your concerns, i.e. decode entities before stripping tags so that any decoded items that become tags are then stripped, too.

  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Thanks, yes I'll have a deeper look and provide a MR when I have the time!

  • πŸ‡ΊπŸ‡ΈUnited States cameron prince

    It already uses strip_tags. This is what was added:

    $body = Html::decodeEntities(strip_tags($body->__toString()));
    
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Let's set this to minor, I think the risk is not existing or low. I'll test some cases asap.

  • Issue was unassigned.
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica
  • πŸ‡©πŸ‡ͺGermany Grevil
  • Status changed to Needs work 9 months ago
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Could someone please elaborate the output, WITH Html::decodeEntities() and WITHOUT?
    And post the result here for an example?

    I think the issue description is pretty clear. The emails are sent as plain text, not HTML. This was added so that contents of the emails are more legible and don't include HTML code or encoded entities.

    Well not if you're using an Email HTML system, e.g. Symfony Mailer, I guess?

  • Merge request !25Remove Html::decodeEntities() β†’ (Open) created by Anybody
  • Status changed to Needs review 9 months ago
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica
  • πŸ‡©πŸ‡ͺGermany lrwebks Porta Westfalica

    I have looked at the same Mail with and without Html::decodeEntities() using MailPit (which displays Plaintext instead of HTML). The Tokens are not affected by its abscence, however the HTML that we send for the error message and backtrace is now escaped and not really legible at times:

    ParseError: syntax error, unexpected token "do" in
    Drupal\Core\Extension\ModuleHandler->loadInclude() (line 8 of
    /var/www/html/web/modules/custom/radioactive_php/radioactive_php.install).
    #0 /var/www/html/web/core/lib/Drupal/Core/Extension/ModuleInstaller.php(465):
    Drupal\Core\Extension\ModuleHandler-&gt;loadInclude(&#039;radioactive_php&#039;,
    &#039;install&#039;)
    

    The same would obviously happen to tags and & and the like (e.g.: "<script>" => "&lt;script&gt;").

  • Status changed to Closed: works as designed 9 months ago
  • πŸ‡©πŸ‡ͺGermany Anybody Porta Westfalica

    Okay just had a look at this with LRWebks and we can close this "Works as designed".

    Drupal (and HTML mail modules) already care for email escaping, this comment explains the details (eventhough for Drupal 7 https://drupal.stackexchange.com/a/72509/47035)

    So we'd have double-escaping if we'd remove this call. We can rely on Drupal for the security here.

Production build 0.71.5 2024