Disallow dangerous filenames e.g. command injection characters

Created on 1 April 2025, about 2 months ago

Problem/Motivation

Following discussion with the Drupal Security Team, it was agreed that this could be handled in a public "security improvements" issue.

At present Drupal's file API allows filenames to be created which could be dangerous if they're not handled safely. This is not a directly exploitable vulnerability, but improvements could be made that would reduce the likelihood of filenames being used as part of a chained attack.

Command injection is a specific concern here.

https://owasp.org/www-community/attacks/Command_Injection

https://portswigger.net/web-security/os-command-injection

Steps to reproduce

In some cases, browsers will escape/encode certain characters in a normal file upload, but it may be possible to avoid that escaping using a tool like Burp Suite, or perhaps via web services (rest / jsonapi).

An example of a dangerous filename which I believe a normal file field will currently accept is:

foo";echo `whoami`; #.txt

Proposed resolution

One or more of:

* Add a transliteration option to remove specific characters that may be used for command injection e.g. " ; # |` and if possible ' &.
* Disallow spaces by default in filenames (makes it quite a lot harder to achieve meaningful command injection).
* Review defaults for filename transliteration to make command injection as hard as possible.

Remaining tasks

* Implement improvements.
* Add tests (e.g. in \Drupal\Tests\file\Functional\SaveUploadTest ).
* Ensure that improvements also apply to web services and other uses of the API if possible.

User interface changes

Default filename handling may change; most noticeable changes for users may include removal/substitution of commonly used characters such as spaces, apostrophes and ampersands.

Introduced terminology

n/a?

API changes

Changes to filename handling may represent an API change.

Data model changes

n/a?

Release notes snippet

tbc

πŸ“Œ Task
Status

Active

Version

11.0 πŸ”₯

Component

file system

Created by

πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

Live updates comments and jobs are added and updated live.
  • Security improvements

    It makes Drupal less vulnerable to abuse or misuse. Note, this is the preferred tag, though the Security tag has a large body of issues tagged to it. Do NOT publicly disclose security vulnerabilities; contact the security team instead. Anyone (whether security team or not) can apply this tag to security improvements that do not directly present a vulnerability e.g. hardening an API to add filtering to reduce a common mistake in contributed modules.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @mcdruid
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    All of these are available as optional settings. Are you saying we should default to safer settings, or enforce them?

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    I'd recommend that we enforce the removal (/ substitution) of the most dangerous characters.

    For removal of spaces, that'd be ok as a default (would people really want to turn it off?!)

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    @kim.pepper although the existing options would remove all of the characters I highlighted as a concern, they do more than that.

    I wouldn't advocate stripping filenames down to only ASCII / alpha-numeric characters (by default or as a mandatory setting) because of how restrictive that is for different languages.

    I'm proposing a fairly limited number of forbidden characters - those which have significance for the shell - which would be a subset of what would be stripped by any of the existing options IIUC?

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Having said that, I went to look for a definitive list of "dangerous characters" and ended up with quite a lot of "it depends".

    This is not definitive, but e.g. https://stackoverflow.com/questions/15783701/which-characters-need-to-be...

    There are some quite good lists using e.g. the shell escaping option that printf (and apparently ls) offer - e.g.

    No, character % does not need to be escaped
    No, character + does not need to be escaped
    No, character - does not need to be escaped
    No, character . does not need to be escaped
    No, character / does not need to be escaped
    No, character : does not need to be escaped
    No, character = does not need to be escaped
    No, character @ does not need to be escaped
    No, character _ does not need to be escaped
    Yes, character   needs to be escaped
    Yes, character ! needs to be escaped
    Yes, character " needs to be escaped
    Yes, character # needs to be escaped
    Yes, character $ needs to be escaped
    Yes, character & needs to be escaped
    Yes, character ' needs to be escaped
    Yes, character ( needs to be escaped
    Yes, character ) needs to be escaped
    Yes, character * needs to be escaped
    Yes, character , needs to be escaped
    Yes, character ; needs to be escaped
    Yes, character < needs to be escaped
    Yes, character > needs to be escaped
    Yes, character ? needs to be escaped
    Yes, character [ needs to be escaped
    Yes, character \ needs to be escaped
    Yes, character ] needs to be escaped
    Yes, character ^ needs to be escaped
    Yes, character ` needs to be escaped
    Yes, character { needs to be escaped
    Yes, character | needs to be escaped
    Yes, character } needs to be escaped
    

    I feel like most of those make sense; and there are a fair number of characters that would be allowed.

    Whether this would be more or less confusing / user friendly than just stripping all punctuation though is perhaps a usability question.

    Looks like the code mostly uses regex; I wonder whether the punct character class would be any use. It'd remove the "safe" characters from the list above which we may not want to do.

    To be clear what I'm looking for here is something that would remove all the dangerous shell characters but not limit users to strict ASCII alphanum without e.g. accented letters etc..

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    FWIW by default Wordpress removes a load of "special" characters from filenames:

    https://github.com/WordPress/wordpress-develop/blob/6.7.2/src/wp-include...

    function sanitize_file_name( $filename ) {
    
    ...
    
    	$special_chars = array( '?', '[', ']', '/', '\\', '=', '<', '>', ':', ';', ',', "'", '"', '&', '$', '#', '*', '(', ')', '|', '~', '`', '!', '{', '}', '%', '+', '’', 'Β«', 'Β»', '”', 'β€œ', chr( 0 ) );
    
    ...
    
    	$filename = str_replace( $special_chars, '', $filename );
    

    Joomla seems to have decided to let the underlying OS decide what it will and will not allow in filenames :shrug:

    https://github.com/joomla/joomla-cms/issues/33214

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Here's a basic implementation. I haven't looked at any tests yet.

  • Pipeline finished with Failed
    about 2 months ago
    Total: 552s
    #467850
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Thanks, that's a good start. Looks like you've borrowed the list of chars to strip from wordpress. Perhaps we could remove a few of the "safe" ones based on the list in #7; we're only talking about a few e.g. % + = :.

    FWIW here's how typo3 does this:

    https://github.com/TYPO3-CMS/core/blob/v13.4.9/Classes/Resource/Driver/L...

    The comments suggest that it's pretty strict:

    ... any character not matching [.a-zA-Z0-9_-] is substituted by '_'

    ...but in actual fact it's a bit more nuanced as there are some settings around allowing / remapping utf8 characters.

    I'll try to do some work on this but am short on time just now.

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    sorry, didn't mean to change status

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Removed config option, and updated tests now we replace special chars.

  • Pipeline finished with Failed
    about 2 months ago
    Total: 682s
    #468761
  • Pipeline finished with Failed
    about 2 months ago
    Total: 528s
    #468787
  • Pipeline finished with Success
    about 2 months ago
    Total: 929s
    #468799
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Tests appear to be there so removing that tag.

    Looking at the solution were all options taken?

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Looking at the solution were all options taken?

    Do you mean are all the special characters included?

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Mean the proposed solution read like there were several approaches? Unless I read that wrong

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    I think we agreed on stripping special characters. It was more about which characters to strip. I have a feeling stripping whitespace might be contentious.

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Can what’s not making it in be scratched from the proposed solutio

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Sorry to be a stickler

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    As a site owner, I'm going to miss Parentheses, Ampersand and Space., A little less Brackets, Pound Sign and Exclamation Point though I encounter them.

    As a security engineer: This is not a bad hardening, however it is certainly not a fix for wherever the faults actually occur.

    As someone who has used this in the past to perform sample RCE's against site setups; I can say yes this would make it much harder to exploit.

    I also wish to reiterate that for these characters to cause issue someone has had to make some very serious mistakes down stream, mistakes that this change likely does not actually remove the vulnerabilities , just at most makes it significantly harder to exploit (move from Basic to Complex, saving 1 point on the Drupal Vulnerability Scoring ) and maybe move the Impacted Environment from All to Uncommon (2 points on the Drupal Score scale).

    I agree with @smustgrave the 'excluded' ideas would be nice to have listed for historical purposes.

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    For the summary cleanup please

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    TBH I don't have a strong opinion on which characters should be included or excluded. I'll happily defer to those with a stronger security background. I _do_ think removing whitespace will be disruptive to site owners.

  • Pipeline finished with Failed
    22 days ago
    Total: 128s
    #492167
  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    Thanks - I'm going to tag this for a usability review (per https://www.drupal.org/docs/develop/issues/issue-procedures-and-etiquett... β†’ ) as I agree that it's likely stripping spaces from filenames on uploads may not be universally welcomed.

    It would, however, have benefits in terms of Security Hardening (it's very hard to achieve Command Injection without any spaces). (Acknowledging the comment in #20 that if a command injection vulnerability of similar exists, this is not the only fix that's needed.)

    There are also other benefits like making it easier to iterate over batches of files in the shell etc.. but that's probably out of scope here.

    I personally loathe spaces in filenames but I am probably not representative of a broad cross section of users.

    So far I think there's some consensus that stripping "special characters" from filenames likely provides enough benefit (in terms of security) to outweigh annoyance some users may feel at having e.g. punctuation marks removed from their filenames, but stripping spaces may be seen as a step too far.

    It could be a default that site owners could disable in the file handling options. I'd argue to make the special character stripping mandatory, but that's also open to debate.

    Previous comments include some light research on how other CMSs/Frameworks do this. Wordpress and Typo3 both strip (or substitute) most punctuation by default, and it looks like they both swap out spaces for either an underscore or dash AFAICS.

    Drupal already has an option to replace whitespace in filenames with _ or - but it's not enabled by default. I'd like to turn that on by default as part of this issue.

    What does the UX-Team think about having mandatory stripping of special chars from filenames, plus enabling swapping spaces out for - or _ by default?

    (I've not yet updated the IS to eliminate any proposed options as I don't think we have eliminated any yet.. agree we should do that when we get to that point though.)

  • πŸ‡¬πŸ‡§United Kingdom mcdruid πŸ‡¬πŸ‡§πŸ‡ͺπŸ‡Ί

    I've posted some more details about my motivation for pursuing this change in a blog post about a recent Security Advisory:

    https://www.mcdruid.co.uk/article/hacking-ai-module-drupal-cms

    Hopefully the extra context this provides outweighs any whiff of self-promotion :)

    To summarise the proposal briefly, the change would be:

    • Filenames always have dangerous characters stripped (or replaced with an inert character like -). In practice this means most but not all punctuation (e.g. see list in #8).
    • Filenames have spaces replaced by -. This is existing optional functionality which would be enabled by default.
    • The replacement character used in both cases could be configurable (again, that functionality already exists).

    This would be pretty similar to how filenames are sanitised by both WordPress and Typo3.

  • Pipeline finished with Failed
    12 days ago
    Total: 89s
    #500056
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Rebased on 11.x

    I also switch the file install config to replace_whitespace: true. I'm not sure this is sufficient as a default or whether we want to go the extra step and enable it via an update hook. This will be disruptive as per #2

  • Pipeline finished with Failed
    12 days ago
    Total: 181s
    #500058
Production build 0.71.5 2024