- United States smustgrave
Closing as a duplicate of "XSS::filter and filter_xss can create malformed attributes when you would expect them to be stripped" (Fixed).
Initially reported by @lauriii in "Upgrade filter system to HTML5" (Fixed): HTML5 allows unescaped less-than and greater-than characters in HTML attribute values, e.g.
<img src="llama.jpg" data-caption="<em>Loquacious llama!</em>" />
Xss::filter() does not handle this:
>>> use \Drupal\Component\Utility\Xss;
>>> Xss::filter('<img src="llama.jpg" data-caption="Loquacious llama!" />', ['img', 'em']);
=> "<img src="llama.jpg" data-caption="Loquacious llama!" />"
>>> Xss::filter('<img src="llama.jpg" data-caption="<em>Loquacious llama!</em>" />', ['img', 'em']);
=> "<img src="llama.jpg">Loquacious llama!</em>" />"
In other words, when an attribute value contains a tag (or even just a >), the output is mangled, and part of the attribute value may end up in the HTML body instead.
Xss::filter() uses two regular expressions to try and extract tags from HTML:
<[^>]*(>|$) # a string that starts with a <, up until the > or the end of the string
This trivially matches anything that looks like a tag, but does not handle attributes that contain >.
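To make the failure concrete, here is a minimal sketch that runs the pattern fragment quoted above on the example markup. The `%` delimiters and the variable names are chosen for illustration, and the full pattern inside Xss::filter() may contain further alternatives that are not reproduced here.

```php
<?php
// Only the fragment quoted in this issue; the real pattern may be longer.
$pattern = '%<[^>]*(>|$)%';
$html = '<img src="llama.jpg" data-caption="<em>Loquacious llama!</em>" />';

preg_match_all($pattern, $html, $matches);
$tags = $matches[0];

// [^>]* stops at the first ">", which sits inside the attribute value,
// so the <img> tag is cut off in the middle of data-caption.
foreach ($tags as $tag) {
    echo $tag, PHP_EOL;
}
// Prints:
// <img src="llama.jpg" data-caption="<em>
// </em>
```

The trailing `" />` is never part of any match, so it would be treated as text outside of any tag.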
if (!preg_match('%^<\s*(/\s*)?([a-zA-Z0-9\-]+)\s*([^>]*)>?|(<!--.*?-->)$%', $string, $matches)) {
Similarly, this seems unable to handle attributes that contain >.
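The same effect can be observed with the second pattern. The sketch below applies the expression from the `preg_match()` call above directly to the example tag; the variable names are illustrative only.

```php
<?php
// Pattern copied from the preg_match() call quoted in this issue.
$pattern = '%^<\s*(/\s*)?([a-zA-Z0-9\-]+)\s*([^>]*)>?|(<!--.*?-->)$%';
$tag = '<img src="llama.jpg" data-caption="<em>Loquacious llama!</em>" />';

$matched = preg_match($pattern, $tag, $matches);

$element = $matches[2];    // 'img'
$attributes = $matches[3]; // 'src="llama.jpg" data-caption="<em'

echo $element, PHP_EOL;
echo $attributes, PHP_EOL;
```

The captured attribute list ends inside the quoted `data-caption` value, so whatever parses the attributes afterwards receives an unterminated quote.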
Determine whether regex is sufficient to filter HTML in this way: https://stackoverflow.com/a/1732454
Improve the regex to handle attributes that contain tag characters, or replace Xss::filter() with something more robust.
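As a rough illustration of the "improve the regex" option, the sketch below treats quoted attribute values as opaque units so that a > inside quotes does not end the tag. The pattern and the name `$quoteAwarePattern` are hypothetical; this is not the change made in the duplicate issue, and it has not been reviewed for security.

```php
<?php
// Hypothetical quote-aware tag pattern (illustration only).
// Inside a tag, consume either a single character that is not ">" or a
// quote, or a complete double-quoted or single-quoted string.
$quoteAwarePattern = '%<(?:[^>"\']|"[^"]*"|\'[^\']*\')*(>|$)%';
$html = '<img src="llama.jpg" data-caption="<em>Loquacious llama!</em>" />';

preg_match_all($quoteAwarePattern, $html, $matches);
$tags = $matches[0];

// The whole <img ... /> element is now matched as a single tag.
echo count($tags), PHP_EOL;
echo $tags[0], PHP_EOL;
```

One known limitation of this sketch: a tag with an unterminated quote is not matched at all, so such input would still need separate handling.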
Closed: duplicate
10.1