Hi Team
Drupal Version 9.5.10
I've had an email from our server guys highlighting a possible security issue.
It appears we've had a malicious bot attack which was blocked but the logs show the error below:
"NOTICE: PHP message: Uncaught PHP Exception Drupal\Core\Entity\EntityStorageException: "SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xC0\xA7\xC0\xA2%2...' for column 'value' at row 1: INSERT INTO "webform_submission_data" ("webform_id", "sid", "name", "property", "delta", "value") VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3,
My understanding is that the submission was meant to fail: the spambot was trying to force a hard PHP error, which it could then use for another, more specific attack.
The server team explained it like this:
Someone (a bot) is POSTing non-Unicode data, probably just binary, to the form. In short, something like:
POST field = \xC0\xA7\xC0\xA2.... spam
... no validation ...
INSERT INTO webform_submission_data VALUES value = \xC0\xA7\xC0\xA2...
As \xC0 is not a valid start byte for a Unicode (UTF-8) value, MySQL is rejecting the INSERT with an error, which is bubbling up and being raised as a Drupal\Core\Entity\EntityStorageException.
The actual data prefixed with \xC0\xA7\xC0\xA2 seems to be just spam: something badly double-encoded whose purpose is unclear. It was sent by a malicious scanner, though, so I would assume it is meant to trigger or detect some kind of security issue somewhere. I doubt it is actually unsafe in this instance; MySQL is quite correct to reject it. You generally shouldn't rely on the database to validate insert data, though.
Any POST data to a form that generates a hard PHP error is a major security risk (likely due to type coercion bugs). It can lead to exploits and leaks from the database, even from something as basic as missing type handling.
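To make the server team's point concrete, here is a minimal sketch in plain PHP (no Drupal bootstrap) showing that the four leading bytes from the log are not valid UTF-8. It assumes the mbstring extension is available. The helper lenient_decode_two_bytes() is hypothetical and only illustrates what a non-conforming decoder would make of those bytes: an apostrophe and a double quote. That would be consistent with a scanner probing for quote-handling bugs, although nothing in the log confirms the intent.

```php
<?php
// The four leading bytes from the log entry, written as raw bytes.
$payload = "\xC0\xA7\xC0\xA2";

// mb_check_encoding() returns false for byte sequences that are not valid UTF-8.
var_dump(mb_check_encoding($payload, 'UTF-8'));      // bool(false)
var_dump(mb_check_encoding('plain text', 'UTF-8'));  // bool(true)

/**
 * Hypothetical helper: decodes a two-byte sequence the way a lenient,
 * non-conforming decoder would (110xxxxx 10yyyyyy -> xxxxxyyyyyy).
 * A conforming decoder must reject 0xC0 as a lead byte instead.
 */
function lenient_decode_two_bytes(string $bytes): string {
  $lead = ord($bytes[0]) & 0x1F;
  $cont = ord($bytes[1]) & 0x3F;
  return chr(($lead << 6) | $cont);
}

// 0xC0 0xA7 collapses to 0x27 and 0xC0 0xA2 collapses to 0x22.
var_dump(lenient_decode_two_bytes("\xC0\xA7"));  // string(1) "'"
var_dump(lenient_decode_two_bytes("\xC0\xA2"));  // string(1) """
```

Drupal core also ships Drupal\Component\Utility\Unicode::validateUtf8(), which may be the more idiomatic check inside a module.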
Conclusion
Although this sounds like a Webform issue, I've already raised it there and the reply was: "The Webform module uses core's ContentEntityBase API for managing submissions, and that's what would be responsible for the data submission. Drupal's standard operating logic is to accept all data and then filter it on display. As such, the database query should allow the data to be saved, so the fact that it failed to save suggests a bug somewhere."
Again, my understanding is that the query was meant to fail so that it would throw a hard PHP error. After some discussion we decided it was a "core issue to validate the encoding of every inserted string. It's a big 'ask', it's a security hardening and it needs input from the community."
Any advice on how I could add an extra layer of validation/sanitisation to core's ContentEntityBase API?
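For discussion, here is a rough sketch of the kind of check I have in mind, written as plain PHP so it runs without Drupal. The function name find_invalid_utf8() and the sample field names are made up for illustration, and it assumes mbstring is available. It is not a patch for ContentEntityBase, just the validation step in isolation.

```php
<?php
/**
 * Hypothetical helper: returns the paths of all submitted string values
 * that are not valid UTF-8. Nested arrays are walked recursively.
 */
function find_invalid_utf8(array $values, string $prefix = ''): array {
  $invalid = [];
  foreach ($values as $key => $value) {
    $path = $prefix === '' ? (string) $key : $prefix . '.' . $key;
    if (is_array($value)) {
      $invalid = array_merge($invalid, find_invalid_utf8($value, $path));
    }
    elseif (is_string($value) && !mb_check_encoding($value, 'UTF-8')) {
      $invalid[] = $path;
    }
  }
  return $invalid;
}

// Made-up submission data; the bad bytes are the ones from the log.
$submitted = [
  'name' => 'Alice',
  'message' => "\xC0\xA7\xC0\xA2 spam",
  'address' => ['city' => 'Leeds', 'street' => "\xC0\xA7"],
];

print_r(find_invalid_utf8($submitted));
// Lists "message" and "address.street".
```

In a module, I assume something like this could be called from a form validation callback on $form_state->getValues(), flagging each offending element with $form_state->setErrorByName() so the request fails validation before any INSERT is attempted. Whether the check belongs there or at the entity/storage level is exactly what I'd like input on.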