Page Cache must respect Vary, otherwise it might serve mismatched responses.

Created on 12 May 2018, over 6 years ago
Updated 11 October 2024, about 1 month ago

The Page Cache module caches whole responses for requests. However, it does not respect the HTTP Vary header.

The Page Cache module calculates the CID as follows:

<?php
  /**
   * Gets the page cache ID for this request.
   *
   * @param \Symfony\Component\HttpFoundation\Request $request
   *   A request object.
   *
   * @return string
   *   The cache ID for this request.
   */
  protected function getCacheId(Request $request) {
    $cid_parts = [
      $request->getSchemeAndHttpHost() . $request->getRequestUri(),
      $request->getRequestFormat(),
    ];
    return implode(':', $cid_parts);
  }
?>

-> It does not include cache contexts, which is critical to consider and which developers can overlook very easily.

Example
Suppose you have a controller that sets a cache context for the HTTP request header Origin (for cross-origin resource sharing). The controller legitimately adds 'Origin' to the 'Vary' response header. Page Cache does not respect this, so a response generated for one origin might be returned to a request from another origin. Browsers might then block the resource under the same-origin policy.
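To make the collision concrete, here is a minimal, framework-free sketch. It assumes a simplified stand-in for getCacheId() that takes plain strings instead of a Symfony Request; the function name sketch_cache_id, the URL and the two origins are hypothetical and only serve as illustration.

```php
<?php
// Simplified stand-in for PageCache::getCacheId(): the ID is built only from
// the absolute URL and the request format, mirroring the method quoted above.
// Request headers are accepted but deliberately ignored, as in the original.
function sketch_cache_id(string $url, string $format, array $headers): string {
  return implode(':', [$url, $format]);
}

// Two anonymous requests to the same (hypothetical) URL from different origins.
$cid_a = sketch_cache_id('https://example.com/api/data', 'json', ['Origin' => 'https://a.example']);
$cid_b = sketch_cache_id('https://example.com/api/data', 'json', ['Origin' => 'https://b.example']);

// Both requests map to the same cache entry, so the response generated for
// the first origin would also be served to the second.
var_dump($cid_a === $cid_b); // bool(true)
```

Because the Origin header never enters the ID, whichever response is stored first wins for every later origin.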

How to avoid this problem

When writing custom controllers: One current workaround for this problem is to skip Page Cache completely by not returning responses that implement CacheableResponseInterface. This is a big drawback though, because you might then have no caching at all on the backend.

Another way is to use Drupal's opt-in CORS support. This can be sufficient if the response headers are more or less static and, in particular, not generated conditionally. One use case where I have found this opt-in support insufficient is a resource that serves Accelerated Mobile Pages, since these require specific logic to generate an acceptable response header. This might get really tough, especially for a site running on multiple domains and protocols.

Proposed solution

I currently know of 4 options:

  1. Page Cache automatically creates a cache context of its own based on the Vary header entry, if that entry contains anything other than Cookie, or
  2. Page Cache includes cache contexts in its CID calculation, at least the minimum set of available ones that affect variation of anonymous requests, especially HTTP request headers, or
  3. Page Cache skips caching (without setting Cache-Control to must-revalidate, no-cache etc.) for anonymous requests whose response has values other than Cookie in its Vary header, or
  4. An alternative to CacheableResponseInterface is introduced: CacheableProxyResponseInterface. When page_cache finds that a response is an instanceof CacheableProxyResponseInterface, it bypasses caching for that response. Other subscribers such as FinishResponseSubscriber would then need some adjustment too. This would enable developers to bypass page_cache while still delivering responses that are cacheable by their proxies.
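Options 2 and 3 can be illustrated with a small standalone sketch. This is not a patch against page_cache: all function names below are hypothetical, requests are reduced to plain arrays, and option 2 assumes that the Vary header names are already known when the ID is computed, which the real Page Cache would first have to solve.

```php
<?php
// Parses a raw Vary header value into sorted, lower-cased header names,
// ignoring Cookie and empty entries. Hypothetical helper, not Drupal core.
function sketch_vary_names(?string $vary): array {
  $names = [];
  foreach (explode(',', (string) $vary) as $name) {
    $name = strtolower(trim($name));
    if ($name !== '' && $name !== 'cookie') {
      $names[$name] = $name;
    }
  }
  ksort($names);
  return array_values($names);
}

// Option 3: only store responses whose Vary header names nothing but Cookie.
function sketch_is_storable(?string $vary): bool {
  return sketch_vary_names($vary) === [];
}

// Option 2: extend the cache ID with the request's values for the varied
// headers. The values are hashed so that ':' characters inside them cannot
// be confused with the ':' separator of the ID.
function sketch_cache_id_with_vary(string $url, string $format, array $request_headers, ?string $vary): string {
  $request_headers = array_change_key_case($request_headers, CASE_LOWER);
  $varied = [];
  foreach (sketch_vary_names($vary) as $name) {
    $varied[$name] = $request_headers[$name] ?? '';
  }
  $parts = [$url, $format];
  if ($varied !== []) {
    $parts[] = hash('sha256', json_encode($varied));
  }
  return implode(':', $parts);
}

$url = 'https://example.com/api/data';
$a = sketch_cache_id_with_vary($url, 'json', ['Origin' => 'https://a.example'], 'Origin, Cookie');
$b = sketch_cache_id_with_vary($url, 'json', ['Origin' => 'https://b.example'], 'Origin, Cookie');
var_dump($a !== $b);                            // bool(true)
var_dump(sketch_is_storable('Cookie'));         // bool(true)
var_dump(sketch_is_storable('Cookie, Origin')); // bool(false)
```

Option 3 is the cheaper of the two, since it only needs the response; option 2 keeps such responses cacheable, but the varied header names must be known before the lookup.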
🐛 Bug report
Status

Active

Version

11.0 🔥

Component

page_cache.module

Created by

🇩🇪 Germany mxh Offenburg


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸 United States dpagini

    This is marked as a major bug with over 6 years of no updates. Should the tagging of this issue be changed at all?

    I actually also believe that this is a problem. I am of the opinion that if page_cache cannot handle the Vary header, then maybe it simply should not cache pages that have that header set...? I don't think that the code sending the page, for example a controller, should need to change anything... it is likely sending the correct response already.

    I think I saw it was introduced in core relatively recently, and I haven't dug deep into it yet, but couldn't page cache also return the reason why something is not cacheable (just by page_cache)? Here's the CR I saw. So maybe it reads X-Drupal-Cache: UNCACHEABLE (Vary cacheability) or something like that?

  • 🇨🇭 Switzerland Elendev

    I've developed the module page_cache_vary as a stopgap until the issue is fixed in the core page_cache module.

    It retrieves the Vary headers by caching them per URL, so that the cost of computing them is as low as possible.

    If this solution is good enough, I can try to integrate it directly into Drupal. Let me know what you think.
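The idea described in this comment, remembering per URL which headers a response varies on and using that at lookup time, might be sketched as follows. This is not the code of page_cache_vary: the class and method names are hypothetical, and a pair of in-memory arrays stands in for a real cache backend.

```php
<?php
// In-memory stand-in for a two-level page cache. Level one remembers, per
// URL, which request headers the last stored response varied on; level two
// stores the responses under a key that includes those header values.
final class SketchVaryAwareCache {
  private array $varyByUrl = [];
  private array $responses = [];

  // Builds the response key from the URL and the varied request headers.
  private function key(string $url, array $headers, array $names): string {
    $headers = array_change_key_case($headers, CASE_LOWER);
    $varied = [];
    foreach ($names as $name) {
      $varied[$name] = $headers[$name] ?? '';
    }
    return $url . '#' . hash('sha256', json_encode($varied));
  }

  // Stores a response body together with the header names it varies on.
  public function set(string $url, array $headers, array $varyNames, string $body): void {
    $names = array_map('strtolower', $varyNames);
    sort($names);
    $this->varyByUrl[$url] = $names;
    $this->responses[$this->key($url, $headers, $names)] = $body;
  }

  // Looks up the Vary names for the URL first, then the matching response.
  public function get(string $url, array $headers): ?string {
    if (!isset($this->varyByUrl[$url])) {
      return null; // Nothing stored for this URL yet.
    }
    return $this->responses[$this->key($url, $headers, $this->varyByUrl[$url])] ?? null;
  }
}

$cache = new SketchVaryAwareCache();
$url = 'https://example.com/api/data';
$cache->set($url, ['Origin' => 'https://a.example'], ['Origin'], 'response for a');
var_dump($cache->get($url, ['Origin' => 'https://a.example'])); // string(14) "response for a"
var_dump($cache->get($url, ['Origin' => 'https://b.example'])); // NULL
```

The first lookup only needs the URL, so the cost of finding out which headers matter is paid once per URL rather than once per request.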
