Use a base16 hash for dynamic css/js asset names

Created on 5 March 2025, about 1 month ago

Problem/Motivation

The file names generated for the dynamic assets are generated from a SHA256 hash, however this hash unintentionally leads to a collision with other potentially malicious file names/URIs because the base64 encoded string contains characters from A-Z, and can lead to a false positive and the asset being unintentionally blocked.

Additional reference: https://github.com/t18d/nG-SetEnvIf/pull/10

Steps to reproduce

Proposed resolution

MD5 can't be used since it was intentionally removed for a reason.

We essentially just need to ensure that the generated hashes use only hexadecimal characters [0-9a-zA-Z].

Which can be acheived by encoding the raw SHA256 binary using base16 (hexadecimal) instead of base64:

echo bin2hex(hash('sha256', 'raw input', true));

Maintaining the uniqueness whilst keeping the file names a bit more predictable.

There should be no notable changes to the end user, as the existing assets should be cleared by a database update, leaving room to use the new hashes.

Remaining tasks

User interface changes

N/A

Introduced terminology

N/A

API changes

Introduce a new \Drupal\Component\Utility\Crypt::hmacBase16() method which'll be used for dynamic file names.

Data model changes

Release notes snippet

TBD

📌 Task
Status

Active

Version

11.0 🔥

Component

asset library system

Created by

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @codebymikey
  • 🇬🇧United Kingdom catch

    The original md5() issue is #723802: convert to sha-256 and hmac from md5 and sha1 . I strongly disagreed with it at the time and still do, annoying that we're still dealing with the fallout of that 14 years later.

    Since then, there is now xxHash support in PHP which is being used in a couple of places in core (including in the asset hash generation itself) but not everywhere consistently yet. We could base16 encode an xxhash and remove the fake cryptography altogether maybe? Would still want the hash salt included in there though I think to ensure the hashes aren't valid across different sites for otherwise identical library combinations.

    Was trying to think whether changing the hash generation will cause any problems after deployments but I don't think it will be any more of a problem than a version change on a commonly used library. 📌 Bring back the asset stale file threshold Active would help to smooth that out a bit more, but don't think it's blocking or anything.

  • The xxHash is an interesting one, I've created a quick 3v4L PHP snippet to showcase the outputs from the various hashing algorithms.

    It appears xxh64 already provides a fairly short 16 character hexadecimal output as it's a 64-bit hash, so that seems like the way forward. I'll try putting in a MR for the changes in due course and see if the change breaks anything.

  • @catch: You wrote the comment I had halfway written before I was distracted onto something else. Yes, we want these algos to be very fast. Moving to cryptographically-secure algos here makes no sense insomuch as I understand the use case.

  • 🇬🇧United Kingdom catch

    One extra question though - if Crypt::hasBase64() gets caught in slightly over-greedy nginx rules, then why don't user password reset hashes and similar?

  • then why don't user password reset hashes and similar?

    They probably do as well, but from my use case so far, user resets are rarely done, and not necessarily deterimental to the user experience enough since the user reset hash can be easily regenerated to give a different hash that won't match the questionable URI path, whereas asset hashes appears to always be the same even after a cache clear and can cause the rest of the site to be non-functional or appear broken.

    One question I've had is whether we even need to base64 encode the SHA256 hashes in the first place? It's already URL safe as hexadecimals, and base64-ing it doesn't appear to make it more secure than it currently is, it just provides a slightly shorter string.

  • Pipeline finished with Failed
    about 1 month ago
    Total: 101s
    #442720
  • Pipeline finished with Success
    about 1 month ago
    Total: 485s
    #442723
  • Pipeline finished with Failed
    28 days ago
    Total: 702s
    #445847
  • Pipeline finished with Failed
    28 days ago
    Total: 594s
    #445963
  • Pipeline finished with Failed
    28 days ago
    Total: 1121s
    #446467
  • Pipeline finished with Failed
    23 days ago
    Total: 952s
    #449677
Production build 0.71.5 2024