Pages generated via subrequest never get cached

Created on 4 March 2022, almost 3 years ago
Updated 24 October 2023, about 1 year ago

Problem/Motivation

I experienced this when looking at 404s and noticed they weren't caching in the proxy. The 404 has a surrogate-control on every request, and never actually gets a cache hit in the page_cache module and never gets the headers to get cached at the proxy.

Digging further it seems if you have a request that generates a subrequest (such as a 404 which sub requests to /system/404) then the sessionless render strategy will kick in on the sub request, and in onRespond of the sub request get wrapped as a BigPipeResponse. Then, when the final outer response hits onRespond, it will wrap it again. Then when it unwraps it later to inject into page_cache using getOriginalHtmlResponse, it is not a HtmlResponse, but a BigPipeResponse, so the DenyPipePipeSessionlessResponses kicks it and refuses to store it.

Thus, cache is completely killed on all 404 responses.

It is also worth noting, when the sub request completes it is effectively a 200 response. The DefaultExceptionHtmlSubscriber changes this to a 404 before responding in the main request. Therefore, it's not as simple as only wrapping in the sub request (where rendering takes place...) because the status code is 200, and that's what will be pushed into the cache (via getOriginalHtmlResponse) - as opposed to the modified version.

Steps to reproduce

1. Install bigpipe_sessionless and page_cache and test pages are caching
2. Generate a 404 and repeat, notice Surrogate-Control always present and never get a X-Drupal-Cache HIT

Proposed resolution

(Removed old proposed resolution as it caused 404 to become 200)

Likely this needs some reconsideration of how the original vs streamed pages are kept and maintained, and what that looks like at a main request level if the response was generated in a sub request like in the 404 route.

Remaining tasks

-

User interface changes

None

API changes

None

Data model changes

None

πŸ› Bug report
Status

Needs work

Version

2.0

Component

Code

Created by

πŸ‡¬πŸ‡§United Kingdom Driskell

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡«πŸ‡·France GaΓ«lG Lille, France

    GaΓ«lG β†’ made their first commit to this issue’s fork.

  • Status changed to Needs review over 1 year ago
  • πŸ‡«πŸ‡·France GaΓ«lG Lille, France

    I stumbled upon the same problem. Thank you for your time spent on this and the info sharing.

    Yes, we cannot wrap only the sub-request, as you explained. But we can wrap only the main request. It seems to fix the Drupal page cache problem.
    Of course, the problem of doing that is that the sub-response won't actually be cached. It seems acceptable to because I guess sub-requests only happen 403, 404,... What we want to stream are "real" pages.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.0.7 + Environment: PHP 8.2 & MySQL 8
    last update over 1 year ago
    11 pass, 2 fail
  • @ga%C3%ABlg opened merge request.
  • πŸ‡«πŸ‡·France GaΓ«lG Lille, France

    My previous change introduced a bug: some placeholders where not handled, resulting in missing html elements in the page (Drupal messages in my case).
    This was because we cannot assume that we can bypass HtmlResponseBigPipeSubscriber::onRespond() just as soon as it's a subrequest. HtmlResponseBigPipeSubscriber::onRespond() transforms the response into a BigPipeResponse if there are Big Pipe placeholders. If so, they need to be handled.
    So a better solution is to avoid the creation of big pipe placeholders in case we are on a subrequest, which I did (https://git.drupalcode.org/project/big_pipe_sessionless/-/merge_requests...). Then no need to bypass HtmlResponseBigPipeSubscriber::onRespond().

    It seems to work well (pages with subrequests are still cacheable and now Drupal messages appear).

    But Drupal placeholders is such a complex system that now I'm wondering if there could be placeholders into placeholders, which might result in less streamed parts than before? I don't think so and it wouldn't be such a problem, so I stop thinking about it!

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.0.7 + Environment: PHP 8.2 & MySQL 8
    last update over 1 year ago
    11 pass, 2 fail
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.0.7 + Environment: PHP 8.2 & MySQL 8
    last update over 1 year ago
    16 pass
  • πŸ‡§πŸ‡ͺBelgium wim leers Ghent πŸ‡§πŸ‡ͺπŸ‡ͺπŸ‡Ί

    Wim Leers β†’ made their first commit to this issue’s fork.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.4 + Environment: PHP 8.2 & MySQL 8
    last update about 1 year ago
    16 pass
  • Status changed to Needs work about 1 year ago
  • πŸ‡§πŸ‡ͺBelgium wim leers Ghent πŸ‡§πŸ‡ͺπŸ‡ͺπŸ‡Ί

    Thanks to both of you for your work on this issue, @Driskell & @GaΓ«lG! πŸ™

    Without test coverage, I cannot commit this unfortunately πŸ˜…

    The good news is that it should be relatively straightforward to add test coverage to the existing \Drupal\Tests\big_pipe_sessionless\Functional\BigPipeSessionlessTest::testBigPipeNoSession() method 😊 That happened most recently in πŸ› Pages are cached while the request policy was set to deny Fixed . Also, thanks to GitLab CI testing, the error output is much better and we can test on all supported core versions simultaneously! πŸŽ‰

    In fact, you already have a good chunk of what you need right in the issue summary:

    Generate a 404 and repeat, notice Surrogate-Control always present and never get a X-Drupal-Cache HIT

    I merged in upstream changes so that this MR will now also run the tests on GitLab CI πŸ‘

Production build 0.71.5 2024