Incorrect Cache-Control headers for authenticated users

Created on 25 April 2020, almost 5 years ago
Updated 19 January 2023, about 2 years ago

Problem / Motivation

When using the page cache, we will generate the following HTTP headers when we need to by-pass the cache:

Cache-Control: must-revalidate, no-cache, private

This appears to be the intended behavior per core/modules/page_cache/tests/src/Functional/PageCacheTest.php:

// Check that authenticated users bypass the cache.
$user = $this->drupalCreateUser();
$this->drupalLogin($user);
...
$this->assertEqual($this->drupalGetHeader('Cache-Control'), 'must-revalidate, no-cache, private', 'Cache-Control header was sent');

This is also what happens when you use the page cache kill switch: \Drupal::service('page_cache_kill_switch')->trigger();. You can also see this in action by inspecting the headers of https://dri.es/status.

The problem is that this is a bad way to tell a browser and any proxy cache not to cache a page:

  • must-revalidate means that the cache must not use the resource after it becomes stale. It needs to first revalidate the request with the origin server.
  • no-cache means the resource can be cached, but that it must be revalidated each time before using it. The name of the header is somewhat counter-intuitive.
  • private means that the resource can't be cached by proxies (e.g. Varnish, CDN), but that it is ok for a browser to cache the resource.

In other words, these headers are at odds with one another:

  1. no-cache allows caching by proxies but requires revalidation. This makes must-revalidate redundant.
  2. private disallows caching by proxies (only the browser can cache). This is at odds with no-cache which allows proxies to cache the resource, as long they always revalidate first.

Proposed Resolution

It is better to use the following header:

Cache-Control: no-store

no-store means that the resource can't be stored by any cache, including the browser's cache.

There appear to be two parts to the fix:

  1. I fixed the implementation of setCacheControlNoCache() and added a new function setCacheControlNoStore(). With my patch applied, setCacheControlNoCache() is no longer used in core, but given that it is a protected method, it might be used by contributed modules. I don't think we can remove it. An alternative solution is to change the implementation of setCacheControlNoCache(), and not introduce setCacheControlNoStore().
  2. private and no-cache are set at the same time because $response->headers->set() appends headers by default. This is fixed by setting the 3rd parameter to TRUE.

I attached a patch but didn't write or update any tests. The existing tests will fail.

Behavior Changes

When this fix is applied, what will change about how Drupal is cached? Is there a change to the browser's back button? Is that change true for all browsers?

🐛 Bug report
Status

Needs review

Version

10.1

Component
Request processing 

Last updated 4 days ago

No maintainer
Created by

🇧🇪Belgium Dries

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸United States weekbeforenext Asheville, NC

    Updated the patch to fix more test failures.

  • 🇺🇸United States weekbeforenext Asheville, NC
  • 🇺🇸United States weekbeforenext Asheville, NC

    Fixing test failures... again.

  • 🇺🇸United States weekbeforenext Asheville, NC
  • Status changed to Postponed about 2 years ago
  • 🇨🇭Switzerland znerol

    It is better to use the following header:

    Cache-Control: no-store

    I disagree for the reasons already stated in #2. The approach taken here is very probably overzealous and has the potential to badly affect UX.

    I propose to postpone this and then reevaluate after the Vary header has been fixed in 🐛 Using the back button after logging out shows you pages from the authenticated user's session Needs work .

  • 🇳🇱Netherlands johnv

    As I understand, the page_cache module is only intended for anonymous users, so moving to other component.

  • First commit to issue fork.
  • Pipeline finished with Failed
    11 months ago
    Total: 178s
    #135400
  • Pipeline finished with Success
    11 months ago
    Total: 629s
    #138348
  • 🇫🇮Finland sokru

    On Slack @catch suggested moving this to "request processing system" since this is not cache subsystem issue. Cleaned the tags based on that.

    The scope of this issue should be making sure the private information is not stored in browser disk. This will resolve the security scanner reports mention on #5 🐛 Incorrect Cache-Control headers for authenticated users Postponed . Changing the Cache-Control header from must-revalidate, no-cache into no-store fixes the 🐛 Using the back button after logging out shows you pages from the authenticated user's session Needs work on Firefox, Chrome, Edge, but not on Safari, see https://discussions.apple.com/thread/251817133. I'd suggest leaving that issue to solve issue with Safari.

  • 🇫🇮Finland sokru

    sokru changed the visibility of the branch 10.0.x to hidden.

  • Status changed to Needs review 11 months ago
  • 🇫🇮Finland sokru

    sokru changed the visibility of the branch 7.x to hidden.

  • 🇭🇺Hungary mxr576 Hungary

    but not on Safari, see https://discussions.apple.com/thread/251817133.

    hm, does Safari also ignores must-understand Cache control value? :O

  • 🇨🇭Switzerland znerol

    Serving responses with Cache-Control: no-store has been the source of wicked UX issues in the past. This needs careful manual evaluation of multiple scenarios across popular browsers. Some scenarios are described in this rather old atlassion blog post.

    One scenario based on Drupal core alone could use the contact form together with a misconfigured e-mail transport. This accurately simulates a situation on a production site where a mail server isn't reachable for a short time. In this case an error message is displayed after the submit button has been pressed. My reaction as a user of this site would be to press the back button in order to get back to the text I've just written, either to save it to a file or to retry the form submission. On a site with Cache-Control: no-store, I fear that the browser will render an empty contact form and that all text is gone.

    My hunch is that the Cache-Control: no-store response header should be used on specific routes, i.e., on pages which display sensitive / confidential data. Examples of this type of information would be social security number, personal health information or credit-card data. Forms which accept and pages which display this kind of data probably should supply their own Cache-Control: no-store header. And security audits will rightfully flag such pages if that isn't the case.

    Also please note that results of any automated reporting tools (including security scanners) need to be interpreted by people with knowledge in that particular field. Comment #5 gives no details about the type of application, the scope of the audit and the reason for the suggested change.

  • 🇫🇮Finland sokru

    hm, does Safari also ignores must-understand Cache control value? :O

    Yes, Safari does ignore must-understand with back-button, Safari behavior is similar to other browser if the "disable cache" is selected on developer toolbar.

    Serving responses with Cache-Control: no-store has been the source of wicked UX issues in the past. This needs careful manual evaluation of multiple scenarios across popular browsers. Some scenarios are described in this rather old atlassion blog post.

    It would be very beneficial to get these weird UX issues documented with repro steps. On core's /contact/feedback contact form I was not able to find any difference between no-store and must-revalidate, no-cache, using browsers back & forward buttons.

    My hunch is that the Cache-Control: no-store response header should be used on specific routes, i.e., on pages which display sensitive / confidential data

    Or we could make Drupal more secure by default using no-store and if people need better caching, they could change the response header. Many times even the unpublished node title could contain confidential data.

    But I acknowledge there are risks of making this change, so it might be best just to close this issue and introduce a contrib module with eventSubscriber using Symfony's HeaderBag to get a desired Cache-Control header.

  • Status changed to Needs work 9 months ago
  • 🇬🇧United Kingdom catch

    I think we need to manually test and document browser behaviour in the issue summary, marking needs work for that.

  • First commit to issue fork.
Production build 0.71.5 2024