- Issue created by @mcdruid
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
All of these are available as optional settings. Are you saying we should default to safer settings, or enforce them?
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
I'd recommend that we enforce the removal (/ substitution) of the most dangerous characters.
For removal of spaces, that'd be ok as a default (would people really want to turn it off?!)
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
@kim.pepper although the existing options would remove all of the characters I highlighted as a concern, they do more than that.
I wouldn't advocate stripping filenames down to only ASCII / alpha-numeric characters (by default or as a mandatory setting) because of how restrictive that is for different languages.
I'm proposing a fairly limited number of forbidden characters - those which have significance for the shell - which would be a subset of what would be stripped by any of the existing options IIUC?
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Having said that, I went to look for a definitive list of "dangerous characters" and ended up with quite a lot of "it depends".
This is not definitive, but e.g. https://stackoverflow.com/questions/15783701/which-characters-need-to-be...
There are some quite good lists using e.g. the shell escaping option that printf (and apparently ls) offer - e.g.
No, character % does not need to be escaped No, character + does not need to be escaped No, character - does not need to be escaped No, character . does not need to be escaped No, character / does not need to be escaped No, character : does not need to be escaped No, character = does not need to be escaped No, character @ does not need to be escaped No, character _ does not need to be escaped Yes, character needs to be escaped Yes, character ! needs to be escaped Yes, character " needs to be escaped Yes, character # needs to be escaped Yes, character $ needs to be escaped Yes, character & needs to be escaped Yes, character ' needs to be escaped Yes, character ( needs to be escaped Yes, character ) needs to be escaped Yes, character * needs to be escaped Yes, character , needs to be escaped Yes, character ; needs to be escaped Yes, character < needs to be escaped Yes, character > needs to be escaped Yes, character ? needs to be escaped Yes, character [ needs to be escaped Yes, character \ needs to be escaped Yes, character ] needs to be escaped Yes, character ^ needs to be escaped Yes, character ` needs to be escaped Yes, character { needs to be escaped Yes, character | needs to be escaped Yes, character } needs to be escaped
I feel like most of those make sense; and there are a fair number of characters that would be allowed.
Whether this would be more or less confusing / user friendly than just stripping all punctuation though is perhaps a usability question.
Looks like the code mostly uses regex; I wonder whether the
punct
character class would be any use. It'd remove the "safe" characters from the list above which we may not want to do.To be clear what I'm looking for here is something that would remove all the dangerous shell characters but not limit users to strict ASCII alphanum without e.g. accented letters etc..
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
FWIW by default Wordpress removes a load of "special" characters from filenames:
https://github.com/WordPress/wordpress-develop/blob/6.7.2/src/wp-include...
function sanitize_file_name( $filename ) { ... $special_chars = array( '?', '[', ']', '/', '\\', '=', '<', '>', ':', ';', ',', "'", '"', '&', '$', '#', '*', '(', ')', '|', '~', '`', '!', '{', '}', '%', '+', 'β', 'Β«', 'Β»', 'β', 'β', chr( 0 ) ); ... $filename = str_replace( $special_chars, '', $filename );
Joomla seems to have decided to let the underlying OS decide what it will and will not allow in filenames :shrug:
- Merge request !11760[#3516706] Remove special chars from filename uploads β (Open) created by kim.pepper
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Here's a basic implementation. I haven't looked at any tests yet.
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Thanks, that's a good start. Looks like you've borrowed the list of chars to strip from wordpress. Perhaps we could remove a few of the "safe" ones based on the list in #7; we're only talking about a few e.g.
% + = :
.FWIW here's how typo3 does this:
https://github.com/TYPO3-CMS/core/blob/v13.4.9/Classes/Resource/Driver/L...
The comments suggest that it's pretty strict:
... any character not matching [.a-zA-Z0-9_-] is substituted by '_'
...but in actual fact it's a bit more nuanced as there are some settings around allowing / remapping utf8 characters.
I'll try to do some work on this but am short on time just now.
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
sorry, didn't mean to change status
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Removed config option, and updated tests now we replace special chars.
- πΊπΈUnited States smustgrave
Tests appear to be there so removing that tag.
Looking at the solution were all options taken?
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Looking at the solution were all options taken?
Do you mean are all the special characters included?
- πΊπΈUnited States smustgrave
Mean the proposed solution read like there were several approaches? Unless I read that wrong
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
I think we agreed on stripping special characters. It was more about which characters to strip. I have a feeling stripping whitespace might be contentious.
- πΊπΈUnited States smustgrave
Can whatβs not making it in be scratched from the proposed solutio
- πΊπΈUnited States cmlara
As a site owner, I'm going to miss Parentheses, Ampersand and Space., A little less Brackets, Pound Sign and Exclamation Point though I encounter them.
As a security engineer: This is not a bad hardening, however it is certainly not a fix for wherever the faults actually occur.
As someone who has used this in the past to perform sample RCE's against site setups; I can say yes this would make it much harder to exploit.
I also wish to reiterate that for these characters to cause issue someone has had to make some very serious mistakes down stream, mistakes that this change likely does not actually remove the vulnerabilities , just at most makes it significantly harder to exploit (move from Basic to Complex, saving 1 point on the Drupal Vulnerability Scoring ) and maybe move the Impacted Environment from All to Uncommon (2 points on the Drupal Score scale).
I agree with @smustgrave the 'excluded' ideas would be nice to have listed for historical purposes.
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
TBH I don't have a strong opinion on which characters should be included or excluded. I'll happily defer to those with a stronger security background. I _do_ think removing whitespace will be disruptive to site owners.
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Thanks - I'm going to tag this for a usability review (per https://www.drupal.org/docs/develop/issues/issue-procedures-and-etiquett... β ) as I agree that it's likely stripping spaces from filenames on uploads may not be universally welcomed.
It would, however, have benefits in terms of Security Hardening (it's very hard to achieve Command Injection without any spaces). (Acknowledging the comment in #20 that if a command injection vulnerability of similar exists, this is not the only fix that's needed.)
There are also other benefits like making it easier to iterate over batches of files in the shell etc.. but that's probably out of scope here.
I personally loathe spaces in filenames but I am probably not representative of a broad cross section of users.
So far I think there's some consensus that stripping "special characters" from filenames likely provides enough benefit (in terms of security) to outweigh annoyance some users may feel at having e.g. punctuation marks removed from their filenames, but stripping spaces may be seen as a step too far.
It could be a default that site owners could disable in the file handling options. I'd argue to make the special character stripping mandatory, but that's also open to debate.
Previous comments include some light research on how other CMSs/Frameworks do this. Wordpress and Typo3 both strip (or substitute) most punctuation by default, and it looks like they both swap out spaces for either an underscore or dash AFAICS.
Drupal already has an option to replace whitespace in filenames with _ or - but it's not enabled by default. I'd like to turn that on by default as part of this issue.
What does the UX-Team think about having mandatory stripping of special chars from filenames, plus enabling swapping spaces out for - or _ by default?
(I've not yet updated the IS to eliminate any proposed options as I don't think we have eliminated any yet.. agree we should do that when we get to that point though.)
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
I've posted some more details about my motivation for pursuing this change in a blog post about a recent Security Advisory:
https://www.mcdruid.co.uk/article/hacking-ai-module-drupal-cms
Hopefully the extra context this provides outweighs any whiff of self-promotion :)
To summarise the proposal briefly, the change would be:
- Filenames always have dangerous characters stripped (or replaced with an inert character like
-
). In practice this means most but not all punctuation (e.g. see list in #8). - Filenames have spaces replaced by
-
. This is existing optional functionality which would be enabled by default. - The replacement character used in both cases could be configurable (again, that functionality already exists).
This would be pretty similar to how filenames are sanitised by both WordPress and Typo3.
- Filenames always have dangerous characters stripped (or replaced with an inert character like
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Rebased on 11.x
I also switch the file install config to
replace_whitespace: true
. I'm not sure this is sufficient as a default or whether we want to go the extra step and enable it via an update hook. This will be disruptive as per #2 - Status changed to Needs review
about 2 months ago 10:50am 20 June 2025 - π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Thank you to the Usability Team who reviewed and discussed this issue in their meeting π Drupal Usability Meeting 2025-05-30 Active . I believe it's still on the todo list to comment here, but the meeting recording and transcript are linked to from that issue so we can glean details of their review in the meantime.
To attempt to summarise - the question was:
What does the UX-Team think about having mandatory stripping of special chars from filenames, plus enabling swapping spaces out for - or _ by default?
...and the answer:
benji: okay, there's a rather long list of special characters that are hard coded.
benji: ... You can't remove anything from that list. It's it's hard coded. The only thing you can change in configuration is whether to replace spaces.
benji: ... the main usability question is whether to allow people to override it.
benji: ... I think I agree that we should
benji: use the more secure default. So by default, replace spaces.
benji: Let let people override. And Ralph agrees.
benji: So we're unanimous on that.
Ralf Koller: [T]hat looks like a reasonable change, definitely.
I'll ask @benjifisher to correct me if that's not quite accurate.
Another important issue that was brought up was whether we're proposing to retrospectively tackle existing files.
I think that's a "no"; the likelihood of unintended consequences is really high if we tried to rename existing files, and we should limit the scope to future uploads only. (This is what Benji and the team correctly assumed).
So let's proceed on the basis that we want to:
- Introduce mandatory replacement / removal of "dangerous characters", and..
- Enable replacement of spaces in filenames by default, but allow this to be overridden.
Benji mentioned in the UX meeting that any overrides may have to happen only in settings.php as otherwise an attacker can potentially make those changes themselves (e.g. via XSS in the UI).
That's definitely a good point, but I think in the case of allowing spaces in filenames I'd be comfortable allowing that change to made in the UI if the dangerous characters are always removed with no way of overriding that behaviour.
- π¬π§United Kingdom mcdruid π¬π§πͺπΊ
Updates to IS now that we've agreed a proposed solution.
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Rebased on 11.x and fixed tests. We had space ' ' in the list of special chars, but this is handled separately.
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Looks like the test fail is an unrelated PHP 8.5 test.
- π¦πΊAustralia kim.pepper πββοΈπ¦πΊSydney, Australia
Looks like it's a random fail π [random test failure] ScaffoldUpgradeTest and ScaffoldTest rely on packagist.org Active
- πΊπΈUnited States benjifisher Boston area
@mcdruid:
I am sorry for the delay in adding a comment. As you said in Comment #26, the Usability team looked at this issue on 2025-05-30. I just reviewed the discussion, and your summary in #26 looks correct.
For the record, the attendees at the usability meeting were benjifisher, rkoller, and simohell.
Thanks for clarifying that the change will not affect new uploads. In fact, the change to the whitespace default will only affect newly installed sites (or sites that newly install the
file
module).At the meeting, we had the impression that stripping punctuation was non-negotiable, so the only change we fully considered was the default handling of whitespace.
If you want more feedback from the usability team, a good way to reach out is in the #ux channel in Slack.