- Issue created by @mcdruid
- Status changed to Needs review
5 months ago 4:05pm 12 September 2024 - π¬π§United Kingdom mcdruid π¬π§πͺπΊ
First pass at a backport.
Manually running tests for dblog and syslog seems fine (with PHP 7.4)
I've not looked at older (or newer) PHP yet.
We should also add new tests which try to pass invalid placeholders to
watchdog()
. - π¬π§United Kingdom mcdruid π¬π§πͺπΊ
https://www.php.net/manual/en/class.stringable.php was only added in PHP 8 but as far as I can see it doesn't cause a problem to use it like this:
if (is_scalar($variable) || is_null($variable) || $variable instanceof \Stringable) { $validated[$key] = $variable; }
In PHP < 8 the previous two checks should suffice in almost all cases for a D7 site.. I'd assume.
However, do we want to have an opt-out variable or similar for this validation, in case it causes problems for some sites?
The other thing that may cause a change in behaviour is validating that the placeholder keys actually follow the spec and begin with one of the three valid characters.
I'd think that's much more likely than the check to ensure that the variable is "string-like" to change the way some modules / sites logging behaves.
Perhaps an opt out flag would be prudent.
- πΈπ°Slovakia poker10
Thanks for working on this!
Besides the questions in #4, I should we probably move the
drupal_validate_watchdog_variables()
check todblog_watchdog()
andsyslog_watchdog()
only? In D8+, theLogMessageParser::parseMessagePlaceholders()
is called only forDbLog
andSyslog
classes, not for all otherLoggerInterface
implementations. If we keep the sanitization in the generalwatchdog()
function, then these variables will be removed even for contribhook_watchdog()
calls, which can have a potential to cause some issues probably.What do you think?
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Seems like a good suggestion to move the validation to core's two implementations of
hook_watchdog
.For dblog I've put it both in the hook and the theme function, so that validation happens before a log entry is written to the db, and when an entry that's already in the db is parsed.
You could perhaps argue that just doing it once would be sufficient.
I've also added an opt-out variable; I'd vote not to add a stanza for this to default settings as I would hope that not many sites will need it.
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Tweaked the test (which is in the system module) so that it enables both syslog and dblog.
Without the validation syslog hits errors when it tries to process non-string variables:
---- WatchdogTest ---- Status Group Filename Line Function -------------------------------------------------------------------------------- Pass Other system.test 3393 WatchdogTest->setUp() Enabled modules: dblog, syslog Exception Notice syslog.module 117 syslog_watchdog() Array to string conversion Exception Uncaught e syslog.module 117 strtr() Error: Object of class stdClass could not be converted to string in strtr() (line 117 of /var/www/html/modules/syslog/syslog.module).
- Status changed to Needs work
5 months ago 10:02pm 15 September 2024 - πΈπ°Slovakia poker10
Thanks!
Currently the tests are failing - it is probably caused by the change in
dblog.admin.inc
, because correctly it should be validating$event->variables
, not$event->message
.However, maybe we do not need to add this validation in that file (on output), because we have committed this #2790857: Log completely unusable when an entry has corrupt serialized data (D7) β , so there is already a check if the variables can be unserialized. But we can test it, just for sure, if that covers all cases (and if the extra validation there is really not needed).
- Status changed to Needs review
5 months ago 8:53am 16 September 2024 - π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Ah, sorry I made that change in a bit of a rush.
I've removed the validation from
dblog_watchdog
as you could argue we should just write whatever's been logged to the db, and only sanitise it when we're trying to use the data (similar to the way that content filtering works). I think that's also the approach that was eventually taken in D10/11 (after quite a bit of discussion in the parent issue about whether to e.g. throw an exception when something weird is included in the original log message).I think it makes sense to limit the scope of this backport to ensuring that core's two logging modules don't throw errors if they encounter something they can't easily parse in a log message.
If we're doing that, I wonder if we even need the opt-out flag? I think it's less likely to be required than if we were validating data on its way into the db or even before
hook_watchdog
is invoked.there is already a check if the variables can be unserialized
This issue is slightly different to the case where the data cannot be unserialized; in this case it's perfectly possible to unserialize non-string data for the placeholders that have been stored.. but there will be errors if you then try to pass e.g. an array or object to a function like
strstr
ort
which expects a string it can swap into the message. - π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Okay the new test fails with that most recent version because the test calls watchdog with non-string data in the placeholders, then queries the db directly to verify that only the stringable variables were stored.
That's copied from dblog's tests in D11:
https://git.drupalcode.org/project/drupal/-/blob/11.x/core/modules/dblog...
So that must mean that dblog validates the placeholders before writing to them to the db.
Let's go back to doing that in the MR.
This is still specific to syslog and dblog so any other module implementing hook_watchdog can do whatever it likes with the variables; including using core's new
drupal_validate_watchdog_variables()
if that makes sense.I think if we're validating / filtering before writing to the watchdog table (in dblog) we should keep the opt-out.
Only remaining question from me is whether we need to filter / validate before writing to the db AND also when theming / parsing log messages on output.
I don't think doing so should be particularly costly in most cases, so I'd vote for belt-and-braces (i.e. do it in both places), especially as we're providing an opt-out.
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Interesting; doing the validation before writing to the db broke the test for the other issue mentioned in #8
We can fix that using the opt-out within that test.. which maybe is okay, but goes against the principle of only testing one thing at a time (I think D7 probably breaks that principle all over the place).
I suspect the reason the test is failing is that the validation does this:
function drupal_validate_watchdog_variables($variables) { if (!is_array($variables)) { return array(); } ...snip...
...which would interfere with the way the test is trying to write
'BAD SERIALIZED DATA'
into the variables field when it callsdblog_watchdog
directly.D10/11 don't do exactly that validation step, but the placeholders do have an array typehint in the function signature of
parseMessagePlaceholders
.I suppose we could just return
$variables
instead of the empty array, but would we want to? I can't think of many legitimate use cases OTOH.