- Issue created by @prudloff
- 🇫🇷France prudloff Lille
Failing tests reveal there is a problem with this approach.
Xss::filter() is sometimes called on partial HTML (for example #prefix and #suffix keys) and because filterXssDataAttributes() uses the DOM API, it tries to correct the HTML and closes unclosed tags. - 🇫🇷France prudloff Lille
Instead of parsing the HTML separately, I moved this to Xss::attributes() which already parses attributes.
I reused the $skip_protocol_filtering condition in order to keep the change simple, I think we can consider that a non-URI attribute is an attribute that could contain HTML. This has the added benefit of sanitizing the alt and title attributes which are other common attack vectors.Some tests are still failing.
- 🇫🇷France prudloff Lille
Xss::filter() encodes some lone characters like &, <, >. This means that we are going to double-encode these characters. For example
data-title="Me & my friends"
would becomedata-title="Me &amp; my friends"
.
Note that this is already the case when using data attributes in CKE. - 🇫🇷France prudloff Lille
This means that we are going to double-encode these characters. For example data-title="Me & my friends" would become data-title="Me & my friends".
Note that this problem already exists in other places in core. For example when using
(string) new Attribute(['data-title' => 'Me & my friends']);
.