- Status changed to Active
3 months ago 4:11am 22 March 2024 - π³πΏNew Zealand quietone New Zealand
The code in question was first committed in
commit 2d0df351d704ba34d191831f7d4b8cb385555be2 Author: Dries Buytaert <dries@buytaert.net> Date: Tue Nov 29 20:17:10 2005 +0000 - Various fixes. Updated CHANGELOG.txt
The most recent change was in #2609928: Xss::attributes() mangles valid attribute names containing numbers β .
We (@RoSk0, @wiifm and myself) researched the issue to determine if underscores are allowed characters in the attribute name. This section of Extensible Markup Language (XML) 1.0 (Fifth Edition) states that they are allowed. Therefor, I am changing this to active.
@RoSk0 and @wiifm
- π¦πΊAustralia klonos 90% Melbourne, Australia - 10% Larissa, Greece
I understand that this issue here was raised to explicitly allow underscores in attribute names, however, I would like to point out the following from https://www.w3.org/TR/REC-xml/#NT-Name
- Attribute names should follow the
NameStartChar (NameChar)*
format NameChar
may contain dashes, dots, mid-dots, numbers (among other characters), but it does NOT explicitly state allowing underscores- also,
NameStartChar
(the first character of attribute names) must NOT be a dash - it can however be a colon (:
) or an underscore.
So unless I'm interpreting that wrong, the current regex is incorrect, as it allows the first character of attributes to be a dash. So if we want to allow underscores, they should only be allowed as the first character of the attribute name.
So unless we don't want to follow the standard strictly (in order to allow the use case for which this issue here was opened), the regex should be like so instead:
[_a-zA-Z][-a-zA-Z0-9]*
The above:
- allows the first letter of the attribute name to be an underscore, but NOT a dash
- allows any character following the first character to be a dash, but NOT an underscore
Not sure about how people feel about the rest of the characters (colons, dots, mid-dots etc.). They seem edge cases to me, and best left to be discussed in a follow-up issue, in order to avoid derailing this one here.
my 2c.
- Attribute names should follow the