D7 backport: Prevent the use of placeholders that cannot be converted into strings when creating logs

Created on 12 September 2024, 5 months ago
Updated 16 September 2024, 5 months ago

Problem/Motivation

D7 would benefit from a backport of https://www.drupal.org/project/drupal/issues/2481349 πŸ› Prevent the use of placeholders that cannot be converted into strings when creating logs Fixed

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

πŸ“Œ Task
Status

Needs review

Version

7.0 ⚰️

Component
Database LoggingΒ  β†’

Last updated 4 months ago

  • Maintained by
  • πŸ‡¦πŸ‡·Argentina @dagmar
Created by

πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @mcdruid
  • Merge request !9479initial backport β†’ (Open) created by mcdruid
  • Status changed to Needs review 5 months ago
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    First pass at a backport.

    Manually running tests for dblog and syslog seems fine (with PHP 7.4)

    I've not looked at older (or newer) PHP yet.

    We should also add new tests which try to pass invalid placeholders to watchdog().

  • Pipeline finished with Success
    5 months ago
    Total: 315s
    #281395
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    https://www.php.net/manual/en/class.stringable.php was only added in PHP 8 but as far as I can see it doesn't cause a problem to use it like this:

          if (is_scalar($variable) || is_null($variable) || $variable instanceof \Stringable) {
            $validated[$key] = $variable;
          }
    

    In PHP < 8 the previous two checks should suffice in almost all cases for a D7 site.. I'd assume.

    However, do we want to have an opt-out variable or similar for this validation, in case it causes problems for some sites?

    The other thing that may cause a change in behaviour is validating that the placeholder keys actually follow the spec and begin with one of the three valid characters.

    I'd think that's much more likely than the check to ensure that the variable is "string-like" to change the way some modules / sites logging behaves.

    Perhaps an opt out flag would be prudent.

  • πŸ‡ΈπŸ‡°Slovakia poker10

    Thanks for working on this!

    Besides the questions in #4, I should we probably move the drupal_validate_watchdog_variables() check to dblog_watchdog() and syslog_watchdog() only? In D8+, the LogMessageParser::parseMessagePlaceholders() is called only for DbLog and Syslog classes, not for all other LoggerInterface implementations. If we keep the sanitization in the general watchdog() function, then these variables will be removed even for contrib hook_watchdog() calls, which can have a potential to cause some issues probably.

    What do you think?

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Seems like a good suggestion to move the validation to core's two implementations of hook_watchdog.

    For dblog I've put it both in the hook and the theme function, so that validation happens before a log entry is written to the db, and when an entry that's already in the db is parsed.

    You could perhaps argue that just doing it once would be sufficient.

    I've also added an opt-out variable; I'd vote not to add a stanza for this to default settings as I would hope that not many sites will need it.

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Tweaked the test (which is in the system module) so that it enables both syslog and dblog.

    Without the validation syslog hits errors when it tries to process non-string variables:

    ---- WatchdogTest ----
    
    
    Status    Group      Filename          Line Function                            
    --------------------------------------------------------------------------------
    Pass      Other      system.test       3393 WatchdogTest->setUp()              
        Enabled modules: dblog, syslog
    Exception Notice     syslog.module      117 syslog_watchdog()                  
        Array to string conversion
    Exception Uncaught e syslog.module      117 strtr()                            
        Error: Object of class stdClass could not be converted to string in strtr()
        (line 117 of /var/www/html/modules/syslog/syslog.module).
    
  • Status changed to Needs work 5 months ago
  • πŸ‡ΈπŸ‡°Slovakia poker10

    Thanks!

    Currently the tests are failing - it is probably caused by the change in dblog.admin.inc, because correctly it should be validating $event->variables, not $event->message.

    However, maybe we do not need to add this validation in that file (on output), because we have committed this #2790857: Log completely unusable when an entry has corrupt serialized data (D7) β†’ , so there is already a check if the variables can be unserialized. But we can test it, just for sure, if that covers all cases (and if the extra validation there is really not needed).

  • Status changed to Needs review 5 months ago
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Ah, sorry I made that change in a bit of a rush.

    I've removed the validation from dblog_watchdog as you could argue we should just write whatever's been logged to the db, and only sanitise it when we're trying to use the data (similar to the way that content filtering works). I think that's also the approach that was eventually taken in D10/11 (after quite a bit of discussion in the parent issue about whether to e.g. throw an exception when something weird is included in the original log message).

    I think it makes sense to limit the scope of this backport to ensuring that core's two logging modules don't throw errors if they encounter something they can't easily parse in a log message.

    If we're doing that, I wonder if we even need the opt-out flag? I think it's less likely to be required than if we were validating data on its way into the db or even before hook_watchdog is invoked.

    there is already a check if the variables can be unserialized

    This issue is slightly different to the case where the data cannot be unserialized; in this case it's perfectly possible to unserialize non-string data for the placeholders that have been stored.. but there will be errors if you then try to pass e.g. an array or object to a function like strstr or t which expects a string it can swap into the message.

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Okay the new test fails with that most recent version because the test calls watchdog with non-string data in the placeholders, then queries the db directly to verify that only the stringable variables were stored.

    That's copied from dblog's tests in D11:

    https://git.drupalcode.org/project/drupal/-/blob/11.x/core/modules/dblog...

    So that must mean that dblog validates the placeholders before writing to them to the db.

    Let's go back to doing that in the MR.

    This is still specific to syslog and dblog so any other module implementing hook_watchdog can do whatever it likes with the variables; including using core's new drupal_validate_watchdog_variables() if that makes sense.

    I think if we're validating / filtering before writing to the watchdog table (in dblog) we should keep the opt-out.

    Only remaining question from me is whether we need to filter / validate before writing to the db AND also when theming / parsing log messages on output.

    I don't think doing so should be particularly costly in most cases, so I'd vote for belt-and-braces (i.e. do it in both places), especially as we're providing an opt-out.

  • Pipeline finished with Failed
    5 months ago
    Total: 381s
    #284600
  • Pipeline finished with Success
    5 months ago
    #284668
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Interesting; doing the validation before writing to the db broke the test for the other issue mentioned in #8

    We can fix that using the opt-out within that test.. which maybe is okay, but goes against the principle of only testing one thing at a time (I think D7 probably breaks that principle all over the place).

    I suspect the reason the test is failing is that the validation does this:

    function drupal_validate_watchdog_variables($variables) {
      if (!is_array($variables)) {
        return array();
      }
    
    ...snip...
    

    ...which would interfere with the way the test is trying to write 'BAD SERIALIZED DATA' into the variables field when it calls dblog_watchdog directly.

    D10/11 don't do exactly that validation step, but the placeholders do have an array typehint in the function signature of parseMessagePlaceholders.

    I suppose we could just return $variables instead of the empty array, but would we want to? I can't think of many legitimate use cases OTOH.

Production build 0.71.5 2024