- Issue created by @mxr576
- Merge request !6080Resolve #3413396 "Xss fiter data uri support for images" โ (Open) created by mxr576
- Status changed to Needs review
11 months ago 10:55am 9 January 2024 - ๐ฌ๐งUnited Kingdom longwave UK
Always wary of changing this code but this looks like a sensible and self-contained change, however I added a question to the MR about applying the new condition.
- ๐ฌ๐งUnited Kingdom longwave UK
Out of scope for this, but I wonder if there is a way of reusing the Masterminds HTML5 parser in the XSS filter instead of having separate code that attempts to parse HTML attributes.
- Status changed to Needs work
11 months ago 1:36pm 9 January 2024 - ๐ฌ๐งUnited Kingdom longwave UK
Added lots more comments that I thought of, mostly around the regex, as these can be brittle and there might be some sneaky ways around it.
- ๐จ๐ฆCanada Charlie ChX Negyesi ๐Canada
@longwave Replacing this code with something more modern would surely be welcome. When I ported KSES to become filter_xss WhatWG was but a year old and HTML5 didn't exist for another two years yet. And as far as I can see, this code didn't change at all in the near two decades since.
- ๐จ๐ฆCanada dalin Guelph, ๐จ๐ฆ, ๐
Follow Mozilla's guidelines and support inline images with โdata:image/*โ unless itโs โdata:image/svg+xmlโ.
I recommend we go the other way and use a whitelist of known safe mimetypes. Who knows what the future brings. Or if I intentionally mangle the format to be something like "data:image/_svg" can I get the web browser to still render this as an svg?
- ๐ฌ๐งUnited Kingdom longwave UK
#8 seems like a good idea over having a regex that contains exceptions.
- ๐ญ๐บHungary mxr576 Hungary
Thanks for all the valuable feedback, everyone!
Or if I intentionally mangle the format to be something like "data:image/_svg" can I get the web browser to still render this as an svg?
Well, should not this be the concern of browsers rather than us? I used the Mozilla blog post as a source because I found no other, but if anybody knows what the current logic is that Firefox/Chrome/Safari or others use to decide safe data URI-s for images, please share.
@longwave Replacing this code with something more modern would surely be welcome. When I ported KSES to become filter_xss for Drupal 4.5.6/4.6.4 WhatWG was but a year old and HTML5 didn't exist for another two years yet. And as far as I can see, this code didn't change at all in the near two decades since.
Yeah.. as part of my investigation, I also concluded that this code is so old... and it made me wonder if and when we should adopt symfony/html-sanitizer, was that already considered? (Again, Drupal core issue search yielded no result.)
I recommend we go the other way and use a whitelist of known safe mimetypes.
Does anybody can come up with a sane default list? I generally like and prefer the whitelisting-based solution, but I wonder (again) if browsers are doing the same nowadays; based on this older Mozilla article, they are not.
- ๐ญ๐บHungary mxr576 Hungary
I recommend we go the other way and use a whitelist of known safe mimetypes.
Does anybody can come up with a sane default list? I generally like and prefer the whitelisting-based solution, but I wonder (again) if browsers are doing the same nowadays; based on this older Mozilla article, they are not.
Maybe this MDN doc could provide us with a sane default list?
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/img#supported_...If this is the direction we should take, should we allow overriding this list with a new parameter in
\Drupal\Component\Utility\Xss::filter()
? - ๐ญ๐บHungary mxr576 Hungary
Opened โจ Adopt Symfony's HTML Sanitizer Active as a potential follow up.
- ๐จ๐ฆCanada joseph.olstad
We're still using D10.2.x in a project and need to roll the MR for 10.2.x, I'll give it a go!
- ๐จ๐ฆCanada joseph.olstad
Before patching:
Broken image in rich_text filtered content when all attributes are allowed for the img element.
After patching and rebuilding caches:
Image is displayed as expected.
Thank you for this fix, it is an important one!
- ๐ฌ๐งUnited Kingdom alexpott ๐ช๐บ๐
One problem with the current MR is that it only accounts for
src="stuff"
and notsrc='stuff'
andsrc=stuffnospaces
which are what the two following blocks of code account for.Also the discussion seems to be to change the list to an allow list rather than and exception list but it got rtbc'd in the current state.
- ๐ฌ๐งUnited Kingdom alexpott ๐ช๐บ๐
Interestingly Symfony's sanitizer allows data on all media urls see https://github.com/symfony/html-sanitizer/blob/a25620fc6407e14331f3c0c56... and doesn't do anything special for SVGs. I tried inlining an SVG with JS using a data url and the JS is not executed.
- ๐บ๐ธUnited States tr Cascadia
Also the discussion seems to be to change the list to an allow list rather than and exception list but it got rtbc'd in the current state.
I assume you are referring to #11 by @dalin:
Follow Mozilla's guidelines and support inline images with โdata:image/*โ unless itโs โdata:image/svg+xmlโ.
I recommend we go the other way and use a whitelist of known safe mimetypes. Who knows what the future brings. Or if I intentionally mangle the format to be something like "data:image/_svg" can I get the web browser to still render this as an svg?
So let me summarize:
Right now, there are NO inline image formats supported. That is not acceptable. For things like email we NEED this. One of my modules, https://www.drupal.org/project/barcodes โ , also needs this.
Whitelists don't scale well. There are dozens of image formats, and more are being added all the time. If we use a whitelist we will always be out of date and there will constantly be feature requests to add new formats.
BUT, as a way forward, how about we add some common, known formats right now? To cover the most common use cases. Trying to come up with a comprehensive list is a futile exercise which will delay this capability and won't prevent us from getting all those feature requests for additional formats.
And if we're going to do a whitelist, perhaps it should be done in
UrlHelper::filterBadProtocol()
?