Aggregated asset generation causes uncacheable assets

Created on 13 June 2024, 6 months ago
Updated 17 September 2024, 3 months ago

Problem/Motivation

Aggregated asset generation logic causes un-cacheable requests to css/jss assets. This is an issue for pages with cached html that may be linking to an old aggregated url causing lots of request to hit origin. It is also is a possible avenue for denial of service since uncached request will always hit origin.

Steps to reproduce

  1. Enable CSS/JS aggregation
  2. Load a page and let drupal generate aggregated css/js files.
  3. On browser network tab copy one of the aggregated CSS/JSS urls that Drupal generated, it should look something like files/css/css_cw9yM8X70Zs2GLF35wJCtMpiTHHIm47v2H8TlCp4tSc.css?delta=0&language=en&theme=yourtheme&include=saomehashedvalue
  4. Modify the request to use some random filenames for example files/css/css_someRandomFileName.css?delta=0&language=en&theme=yourtheme&include=saomehashedvalue
  5. Visit the modified path above and Drupal will respond with Cache-Control: private, no-store and will continue to do so

Proposed resolution

This seems to be caused by the changes introduced in https://www.drupal.org/node/3301716 specifically this part of the code
core/modules/system/src/Controller/AssetControllerBase.php Line 200-206

    if (hash_equals($generated_hash, $received_hash)) {
      $this->dumper->dumpToUri($data, $this->assetType, $uri);
    }
    return new Response($data, 200, [
      'Cache-control' => static::CACHE_CONTROL,
      'Content-Type' => $this->contentType,
    ]);

When the received URL hash (filename) doesn't match the drupal generated hash, drupal always responds with the aggregated file but with a no-store cache control. This seems to be because we want to "still serve aggregates that may be referenced in cached HTML." We should look over this use-case and adjust the logic to allow the assets to be cached.

🐛 Bug report
Status

Fixed

Version

10.3

Component
Asset library 

Last updated 1 day ago

No maintainer
Created by

🇺🇸United States hablat

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @hablat
  • 🇺🇸United States hablat
  • 🇺🇸United States hablat

    Added a patch that redirects the old asset url to the new expected url. Should then lead to serving a valid file from the asset directory. Not sure if this is the right approach, but working for us so far.

  • Open in Jenkins → Open on Drupal.org →
    Environment: PHP 8.1 & MySQL 5.7
    last update 6 months ago
    29,687 pass, 2 fail
  • Use merge requests please. Is there really no existing bug report for this in the asset library system? There were a few originating from the recent change.

  • 🇺🇸United States hablat

    Looks like there are some test scripts that still expect an private no-store for aggregated assets. May need to have them adjusted as well.

    I did look around and there were some issue potentially related, but did not find anything dealing with this. The closest is the related issue #3427916 which seems to be tackling this through .htaccess change.

    I'll work on getting MR with the test script adjustments as well

  • 🇺🇸United States hablat
  • Pipeline finished with Failed
    6 months ago
    Total: 190s
    #201123
  • 🇺🇸United States hablat

    Created an MR and hid the patch. Also changed to apply to the main 11.x branch

  • Pipeline finished with Success
    6 months ago
    Total: 511s
    #201134
  • Pipeline finished with Success
    6 months ago
    Total: 512s
    #201218
  • Pipeline finished with Canceled
    6 months ago
    Total: 308s
    #201254
  • Pipeline finished with Success
    6 months ago
    Total: 584s
    #201260
  • Status changed to Needs review 4 months ago
  • 🇬🇧United Kingdom catch

    This is a good idea, only one thing I'm concerned about:

    The main situation this logic is intended to support is within the first few days after a new code deployment. What happens if someone deploys some code that only changes e.g. a CSS library from a theme, then reverts it again?

    e.g.

    Original state: Hash A
    Deploy 1: Hash B.
    Deploy 2: Hash A.

    In this case after the first deployment, a CDN could cache redirects from hash A to hash B, and then after the second deployment, if someone gets that cached redirect, they'll hit the URL for hash B and get redirected to back to hash A in an infinite loop.

    This is much more of an edge case than the current issue, but should we set a shorter TTL on the response like 10 minutes or 1 hour?

  • 🇺🇸United States hablat

    We're not too keen on having Drupal dictate the cache headers/ttl. It's what got us in trouble the first time around. I think in that scenario it should be dealt on the CDN level. Configure CDN to have lower ttls on asset paths/redirects, expire cache after deployment, etc.

    The main issue for us is that if you gave any asset filename, that drupal was not expecting, it would respond with a 200 Cache-Control: private, no-store, essentially taking away the caching control from the upper layers of cache. This is an easily exploitable way to keep hitting origin.

  • Status changed to Needs work 4 months ago
  • 🇬🇧United Kingdom catch

    Configure CDN to have lower ttls on asset paths/redirects, expire cache after deployment, etc.

    Asset paths and redirects are two very different things. An asset file that actually exists can be cached more or less indefinitely.

  • 🇺🇸United States hablat

    Right, I wasn't implying they were the same. Just saying their cache TTLs can both be set on the CDN as long as Drupal doesn't set the precedent for the cache headers.

  • 🇺🇸United States hablat

    I did test for this on the browser, and was not able to get the redirect loop to happen. In the Scenario where you visit Hash A, where it was previously redirected it gave a 200 response instead of a 301. If you visit Hash B it gets redirected to Hash A, but also returns a 200 instead of the 301. I may be missing something, but if you can induce this locally with just browser cache please let me know the testing steps.

    I do believe that if the issue/scenario is induced by having another caching layer, like a CDN, then it should be dealt there.

    But if we do have to make Drupal dictate the TTL, what we can do is instead of dictating the TTL on the asset file. We can instead set a cache TTL for the redirect.

  • 🇬🇧United Kingdom catch

    We can instead set a cache TTL for the redirect.

    Yes this is what I suggested in the MR - the redirect can set a cache TTL.

  • 🇺🇸United States hablat

    Thanks, I've modified the MR to set a max-age for the redirect.

  • Pipeline finished with Success
    4 months ago
    Total: 2225s
    #262039
  • Status changed to RTBC 4 months ago
  • 🇬🇧United Kingdom catch

    Thanks that looks much better. 1 hour of caching should be plenty for the redirect to reduce the hits to PHP, but also means if the situation in #9 does arise, it would happen for less than an hour (because it would take at least two deployments to get into that state within an hour in the first place).

    RTBC for me.

  • 🇸🇪Sweden emek

    We have tried this update for our sites running Drupal 10.3.2 and it seems to solve our problem that not all JS-files were cached.

  • 🇬🇧United Kingdom catch

    @ernek if you're able to, please separately try 🐛 Logic error in Drupal's lazy load for asset aggregation Active - this should fix an issue where some aggregates are treated as invalid/uncacheable when they properly should be.

    The change here is still valid for aggregates that are properly stale.

    • nod_ committed 858b4271 on 10.3.x
      Issue #3454507 by hablat, catch: Aggregated asset generation causes...
    • nod_ committed 572b08c0 on 10.4.x
      Issue #3454507 by hablat, catch: Aggregated asset generation causes...
    • nod_ committed 64e70104 on 11.0.x
      Issue #3454507 by hablat, catch: Aggregated asset generation causes...
    • nod_ committed 4f37d8d6 on 11.x
      Issue #3454507 by hablat, catch: Aggregated asset generation causes...
  • Status changed to Fixed 4 months ago
  • 🇫🇷France nod_ Lille

    Committed and pushed 4f37d8d64c to 11.x and 64e70104b2 to 11.0.x and 572b08c076 to 10.4.x and 858b427184 to 10.3.x. Thanks!

  • 🇸🇪Sweden emek

    @catch I tried the patch for 10.3 in that issue, but it didn't seem to work for me to get the js-files to be cached.

  • 🇧🇬Bulgaria alexrayu

    After the change in 10.3.5 we are having errors "Refused to apply style from 'file_name.css' because its MIME type ('text/html') is not a supported stylesheet MIME type, and strict MIME checking is enabled." Closer inspection shows that there is a redirect in place, but the redirect's destination does not exist. Was currently unable to identify what causes this issue with some sites (production only). Checked caching, but found no relation. Will have to patch out this change for now.

  • 🇬🇧United Kingdom catch

    @alexrayu please try the change from 🐛 Logic error in Drupal's lazy load for asset aggregation Active which should hopefully eliminate the redirects added here except for the case of recently stale aggregates.

  • 🇧🇬Bulgaria alexrayu

    @catch Thanks, it helped.

  • Hello, we have issues with redirects in 10.3.5. The root cause seems to be #3467860 🐛 Logic error in Drupal's lazy load for asset aggregation Active . However, we also use hook_file_url_alter to modify assets URLs, so the redirect destination is broken for us. I created #3475639 🐛 AssetControllerBase does not use file_url_generator when redirecting Active .

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • 🇮🇹Italy plach Venezia

    Vaguely related issue: 🐛 Compressed aggregates not delivered on first request Needs work .

Production build 0.71.5 2024