Cache redirect error displayed when using 'view own unpublished content' permission alongside node grants

Created on 3 February 2025, about 2 months ago

Problem/Motivation

Since upgrading to 10.4.1 I have seen a user warning about overwriting cache redirects:

User warning: Trying to overwrite a cache redirect with one that has nothing in common, old one at address "languages:language_interface, theme, user.permissions, route" was pointing to "user", new one points to "user.node_grants:view". in Drupal\Core\Cache\VariationCache->set() (line 138 of /app/web/core/lib/Drupal/Core/Cache/VariationCache.php).

Debugging on 11.x I was able to reproduce it. It is coming from the local tasks for nodes under certain conditions after the entity status is changed.

It appears that when the user has the view own unpublished content permission and the node is not published then the node access handler returns access with the user.permissions context added by the checkViewAccess method.

When the node is published checkViewAccess returns NULL and so the parent call continues down to where $access_result = $this->grantStorage->access($node, $operation, $account); is called. At this point if there is an implementation of hook_node_grants then NodeGrantDatabaseStorage adds the user.node_grants:view context.

This difference seems to cause errors for variation cache works and when it renders the tabs after the status change produces the error.

> User warning: Trying to overwrite a cache redirect for "entity_view:block:stark_local_tasks:[languages:language_interface]=en:[route]=entity.node.canonical02d270e527ee2db38f14edbd305a4fea048b224ab212dd8a734fefc49e4a04ab:[theme]=stark:[user.permissions]=863125a601a1ef531f97c3f043764864a62066d84f76547712f32db5e29fa5eb" with one that has nothing in common, old one at address "languages:language_interface, theme, user.permissions, route" was pointing to "user.roles:authenticated, user", new one points to "user.node_grants:view". in Drupal\Core\Cache\VariationCache->set() (line 138 of core/lib/Drupal/Core/Cache/VariationCache.php).

I have had no luck trying to reproduce displaying the error itself in an automated test but I can regularly reproduce it locally on minimal install of 11.x (which includes recent changes to 🐛 Access cacheability is not correct when "view own unpublished content" is in use Needs work ) and on 10.4 (which does not have the changes in 🐛 Access cacheability is not correct when "view own unpublished content" is in use Needs work .

Steps to reproduce

Setup:

  1. Standard install of 11.x
  2. Enable node_access_test_empty test module
  3. Create page content type - set default to not published
  4. Ensure local tasks block is placed on the page
  5. Create role with the following permissions: 'create page content', 'edit own page content', 'view own unpublished content'
  6. Create an user with the above role

Flow:

  1. As a user with the above role create a new page node (they must be the author)
  2. View the node and ensure the local tabs are rendered
  3. As admin or other user with appropriate permissions publish the node
  4. As the original node author, view the published node again and you get the cache redirect error message

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

🐛 Bug report
Status

Active

Version

11.0 🔥

Component

cache system

Created by

🇳🇿New Zealand ericgsmith

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @ericgsmith
  • 🇳🇿New Zealand ericgsmith

    ericgsmith changed the visibility of the branch 3504114-cache-redirect-error to hidden.

  • 🇳🇿New Zealand ericgsmith

    ericgsmith changed the visibility of the branch 3504114-cache-redirect-error to active.

  • Pipeline finished with Canceled
    about 2 months ago
    Total: 65s
    #413963
  • 🇳🇿New Zealand ericgsmith

    Thank you for moving the issue. Unfortunately I don't have the knowledge to say this is a node issue and not a cache issue?

    I have had trouble really digging into the variation cache as I think I can't find documentation on some of the basic terms like cache redirect - and I'm confused why the redirects which appear to relate to this warning don't contain cache tags. There is no doubt a good reason, but I don't know enough about that to say that this is a node issue.

    It seems like the cache has a dependency on a node - and that node changes so the cache is correctly invalidated, but whatever these redirects are seem to hang around due to the absence of tags. Now the node has changed that brings about a scenario where the context is then different - and that triggers an error.

    The test WIP I added was just to highlight the difference, not to say that the difference is unexpected.

    Also - as far as I understood the cache context docs, user.node_grants cache context should specify that the item is uncachable - so I'm futher confused how we end up getting cache redirects errors for something that should not be cached.

  • Merge request !11395Issue #3504114: Test to demo error → (Open) created by ericgsmith
  • Pipeline finished with Failed
    28 days ago
    Total: 145s
    #441543
  • Pipeline finished with Failed
    28 days ago
    Total: 703s
    #441544
  • 🇳🇿New Zealand ericgsmith

    Alright, gave this another crack and for whatever reason I need to repeat the unpublish dance to generate the error, and now have the error generating in CI - https://git.drupalcode.org/issue/drupal-3504114/-/jobs/4579941#L1064

    Is this a cache context optimization issue?

  • First commit to issue fork.
  • 🇬🇧United Kingdom catch

    This was previously looked at in 🐛 Access cacheability is not correct when "view own unpublished content" is in use Needs work but the test clearly fails so we must have missed something there. Renamed the test class.

  • Pipeline finished with Failed
    28 days ago
    Total: 515s
    #441731
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Debugging on 11.x I was able to reproduce it. It is coming from the local tasks for nodes under certain conditions after the entity status is changed.

    At a quick glance, the NodeAccessControlhandler definitely adds the node as a cacheable dependency. So if the status changes, anything that was set in there should get invalidated and the error you report should not occur. I'll see if I can reproduce this locally. Might only be tomorrow or next week, though.

    LocalTasksBlock::build() seems to take care with its cacheable metadata, though...

  • 🇬🇧United Kingdom catch

    I pushed a commit that makes the test go green - it is not the correct fix, but I ran out of time and that's where I got to debugging.

  • Pipeline finished with Failed
    27 days ago
    Total: 417s
    #441761
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Managed to reproduce following steps in IS.

    On my latest branch where blocks are placeholdered, it shows the following two errors:

    User warning: Trying to overwrite a cache redirect for "entity_view:block:olivero_secondary_local_tasks:[languages:language_interface]=en:[route]=entity.node.canonical02d270e527ee2db38f14edbd305a4fea048b224ab212dd8a734fefc49e4a04ab:[theme]=olivero:[user.permissions]=060b2fe6874d87de20e63c5747f201eba0ae8bc76218084bdefae863baed5349" with one that has nothing in common, old one at address "languages:language_interface, theme, user.permissions, route" was pointing to "user.roles:authenticated, user", new one points to "user.node_grants:view". in Drupal\Core\Cache\VariationCache->set() (line 185 of core/lib/Drupal/Core/Cache/VariationCache.php).
    
    User warning: Trying to overwrite a cache redirect for "entity_view:block:olivero_primary_local_tasks:[languages:language_interface]=en:[route]=entity.node.canonical02d270e527ee2db38f14edbd305a4fea048b224ab212dd8a734fefc49e4a04ab:[theme]=olivero:[user.permissions]=060b2fe6874d87de20e63c5747f201eba0ae8bc76218084bdefae863baed5349" with one that has nothing in common, old one at address "languages:language_interface, theme, user.permissions, route" was pointing to "user.roles:authenticated, user", new one points to "user.node_grants:view". in Drupal\Core\Cache\VariationCache->set() (line 185 of core/lib/Drupal/Core/Cache/VariationCache.php).
    

    With both having the same stacktrace of:

    Drupal\Core\Cache\VariationCache->set() (Line: 117)
    Drupal\Core\Render\RenderCache->set() (Line: 121)
    Drupal\Core\Render\PlaceholderingRenderCache->set() (Line: 539)
    Drupal\Core\Render\Renderer->doRender() (Line: 203)
    Drupal\Core\Render\Renderer->render() (Line: 120)
    Drupal\Core\Render\Renderer->Drupal\Core\Render\{closure}() (Line: 593)
    Drupal\Core\Render\Renderer->executeInRenderContext() (Line: 119)
    Drupal\Core\Render\Renderer->renderInIsolation() (Line: 146)
    Drupal\Core\Render\Renderer->doRenderPlaceholder() (Line: 183)
    Drupal\Core\Render\Renderer->renderPlaceholder() (Line: 715)
    Drupal\big_pipe\Render\BigPipe->renderPlaceholder() (Line: 495)
    Drupal\big_pipe\Render\BigPipe->Drupal\big_pipe\Render\{closure}()
    Fiber->start() (Line: 502)
    Drupal\big_pipe\Render\BigPipe->sendPlaceholders() (Line: 254)
    Drupal\big_pipe\Render\BigPipe->sendContent() (Line: 113)
    Drupal\big_pipe\Render\BigPipeResponse->sendContent() (Line: 395)
    Symfony\Component\HttpFoundation\Response->send() (Line: 20)
    

    So that definitely narrows it down to the olivero_secondary_local_tasks and olivero_primary_local_tasks blocks having the wrong cacheable metadata. Both use the local_tasks_block block plugin.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    When the node is unpublished, $links contains:

    \Drupal\Core\Access\AccessResultAllowed::__set_state(array(
       'cacheContexts' => 
      array (
        0 => 'user.permissions',
        1 => 'user.roles:authenticated',
        2 => 'user',
      ),
       'cacheTags' => 
      array (
        0 => 'node:1',
      ),
       'cacheMaxAge' => -1,
    ))
    
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    When we get to NACH::checkAccess()'s last return statement, the access result only has 'user.node_grants:view' cache context on it. This is because the node is published and checkViewAccess() does nothing. But it does also have the node:1 cache tag. So if the node gets unpublished, the entire cache entry for said node should vanish.

    I'm thinking this is a VariationCache bug where it still finds the old redirects.

    This comment is actually not true:

    //  Cache redirects are stored indefinitely and without tags as they never
    // need to be cleared. If they ever end up leading to a stale cache item
    // that now uses different contexts then said item will either follow an
    // existing path of redirects or carve its own over the old one.
    

    In this scenario, we have the following chain:

    array (
      'entity_view:block:olivero_primary_local_tasks:[languages:language_interface]=en:[theme]=olivero:[user.permissions]=060b2fe6874d87de20e63c5747f201eba0ae8bc76218084bdefae863baed5349' => 
      (object) array(
         'cid' => 'entity_view:block:olivero_primary_local_tasks:[languages:language_interface]=en:[theme]=olivero:[user.permissions]=060b2fe6874d87de20e63c5747f201eba0ae8bc76218084bdefae863baed5349',
         'data' => 
        \Drupal\Core\Cache\CacheRedirect::__set_state(array(
           'cacheContexts' => 
          array (
            0 => 'languages:language_interface',
            1 => 'theme',
            2 => 'user.permissions',
            3 => 'route',
          ),
           'cacheTags' => 
          array (
          ),
           'cacheMaxAge' => -1,
        )),
         'created' => '1741266457.894',
         'expire' => '-1',
         'serialized' => '1',
         'tags' => 
        array (
        ),
         'checksum' => '0',
         'valid' => true,
      ),
      'entity_view:block:olivero_primary_local_tasks:[languages:language_interface]=en:[route]=entity.node.canonical02d270e527ee2db38f14edbd305a4fea048b224ab212dd8a734fefc49e4a04ab:[theme]=olivero:[user.permissions]=060b2fe6874d87de20e63c5747f201eba0ae8bc76218084bdefae863baed5349' => 
      (object) array(
         'cid' => 'entity_view:block:olivero_primary_local_tasks:[languages:language_interface]=en:[route]=entity.node.canonical02d270e527ee2db38f14edbd305a4fea048b224ab212dd8a734fefc49e4a04ab:[theme]=olivero:[user.permissions]=060b2fe6874d87de20e63c5747f201eba0ae8bc76218084bdefae863baed5349',
         'data' => 
        \Drupal\Core\Cache\CacheRedirect::__set_state(array(
           'cacheContexts' => 
          array (
            0 => 'languages:language_interface',
            1 => 'theme',
            2 => 'user.permissions',
            3 => 'route',
            4 => 'user.roles:authenticated',
            5 => 'user',
          ),
           'cacheTags' => 
          array (
          ),
           'cacheMaxAge' => -1,
        )),
         'created' => '1741266457.895',
         'expire' => '-1',
         'serialized' => '1',
         'tags' => 
        array (
        ),
         'checksum' => '0',
         'valid' => true,
      ),
      'entity_view:block:olivero_primary_local_tasks:[languages:language_interface]=en:[route]=entity.node.canonical02d270e527ee2db38f14edbd305a4fea048b224ab212dd8a734fefc49e4a04ab:[theme]=olivero:[user]=2' => false,
    )
    

    The first redirect never changes no matter how many times we save the node. And because we don't tag redirects it means it will always lead to the second redirect where (if the first visit was unpublished) we have the following contexts:

          array (
            0 => 'languages:language_interface',
            1 => 'theme',
            2 => 'user.permissions',
            3 => 'route',
            4 => 'user.roles:authenticated',
            5 => 'user',
          ),
    

    Now, the 'user.roles:authenticated' and 'user' cache context should only ever be added if the node is unpublished. But because we never flushed the redirects, the cache GET when the node gets published finds the still accurate first redirect, which leads to the stale second redirect and that's where shit hits the fan.

    Oh dear.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Keep in mind that the bug only exists because we have no entity status cache context.

    Basically the node access check outcome doesn't depend on the entity status, it varies by it. So if in NACH::checkViewAccess() we could do the following:

    -    // If the node status changes, so does the outcome of the check below, so
    -    // we need to add the node as a cacheable dependency.
    -    $cacheability->addCacheableDependency($node);
    +    $cacheability->addCacheContexts(['node_status:1']);
        if ($node->isPublished()) {
          return NULL;
        }
        $cacheability->addCacheContexts(['user.permissions']);
    

    Then we would be in the clear because the second redirect would be moved to the third and in its place would be a new redirect which only redirects to said node's status.

    Now the question is whether such a cache context would make sense, because it could require us to load an entity that's not on the route. Here it is, but for argument's sake... And given how cache contexts need to be fast I'm not sure that's such a great idea.

    So the second best thing would be to add cache tags to redirects, but then they might get invalidated all the time, killing the VC performance gain. Because, by the time we get to VC, we have the cache tags of the entire thing and we can't know at what point they became relevant. So we might be tagging the first redirect with node:1 even though the status of said node will never affect it.

    The whole point of not flushing redirects is because everything that could lead to a redirect must be representable by a cache context. In the case of NACH it's not. Oof.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Oh and before anyone panics: There is no immediate risk here.

    The hardening in VC will complain about the redirect, but it will then store the new redirect just fine. The risk is that the goal was to turn the warning into an exception and then we would be in trouble, so we can't do that until we figure this one out.

  • 🇬🇧United Kingdom catch

    I'm wondering if we can have a node status cache context that always returns TRUE (or whatever), but it adds the node ID as the cache tag (possibly without loading it as an optimisation) so that the redirect gets tagged when it's present.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    For this particular case, what we have is this:

    1. BlockViewBuilder::build() builds a render array with keys ['entity_view', 'block', $entity->id()]
    2. The Renderer will automatically add in the required cache contexts:
      • 'languages:language_interface',
      • 'theme',
      • 'user.permissions',
    3. LocalTaskManager::getLocalTasks() adds the 'route' cache context
    4. NACH adds extra cache contexts

    So this leads to a top-level redirect with just the required cache contexts, pointing to 'route' and whatever else the local tasks of the very first page after a cache rebuild specifies. As we visit more pages, the cacheable metadata of the local tasks will vary but always contain route and as such we get to a second redirect through self-healing seen in #15.

    So now we have for cache keys ['entity_view', 'block', $entity->id()]:

    1. The top level cache entry as a CacheRedirect to ['languages:language_interface', 'theme', 'user.permissions'] - This is the case for ALL local task blocks
    2. The second level cache entry as a CacheRedirect to ['route'] - This comes from a massive amount of different routes, not per se node related
    3. A third level cache entry with a CacheRedirect for our node specific cache contexts

    Now imagine we hit the node page first and tag both the first and second tier redirect with 'node:1'. We then visit a few more pages, build more redirects and warm some caches and node 1 gets saved. Boom, the local tasks of all those pages cannot find their redirect path to their cache entry any more and have to calculate their contents anew.

    Trying to be smart about this during self-healing is also impossible because you cannot know at what point the tag became relevant. Set it too early and you clear too many caches, set it too late and you might have a security risk on your hands.

    Now imagine we did have a cache context to represent an entity status. Then we would have the following:

    1. The top level cache entry as a CacheRedirect to ['languages:language_interface', 'theme', 'user.permissions']
    2. The second level cache entry as a CacheRedirect to ['route'] - ANY REDIRECT AFTER THIS POINT IS FOR THE SPECIFIC NODE ON THE ROUTE ONLY
    3. A third level cache entry as a CacheRedirect to 'entity_status:node|1'
    4. A fourth level cache entry with a CacheRedirect for our node specific cache contexts that varied by status

    This is fine, because if we subsequently visit the node 2 page, we would only find the first two redirects and then resolving the route cache context would lead to a miss, at which point we can store a redirect for 'entity_status:node|2'.

    Finally, there's still a very real chance the final page would be tagged with the node cache tag, but if that one gets invalidated, the redirect chain remains intact as it is not affected by any change on the node whatsoever. Status 0 will always point to one CacheRedirect and status 1 to the other. That will never, ever change.

    So I'm starting to really like the prospect of an EntityStatusCacheContext for this scenario. tagging redirects is a no-go and flushing them all whenever the end of the chain is invalidated is just enormously wasteful.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I'm wondering if we can have a node status cache context that always returns TRUE (or whatever), but it adds the node ID as the cache tag (possibly without loading it as an optimisation) so that the redirect gets tagged when it's present.

    You mean a cache context that only serves a purpose to "hint" to VC that it should add some cache tags to the redirect where it's present? Now that would be an awesome alternative. Would have to be its own kind of cache context, though. We should not mix that behavior with regular cache contexts.

    Example with cache context "entity_status:node|1"

    1. Returns TRUE, so there is no performance hit trying to load stuff
    2. Extends an interface that gets aggregated so we have a list of these contexts that VC can compare against
    3. if VC finds it's a "special" cache context when trying to write a redirect, it calls the ::getCacheableMetadata() method of the cache context and adds those tags to the redirect

    There's still the question of what happens if it's present on a node page when all caches are warm. Because at that point we'd try to write a redirect with the following added:

    • 'route'
    • 'entity_status:node|1',
    • 'user.roles:authenticated'
    • 'user'

    If this gets optimized on a second visit to, say, a taxonomy page then self-healing would break it up and store the route redirect. That would erase the node stuff and next time we visit said node we'd store the redirect without 'route' and tag that with the node tag.

    HOLY SHIT THAT WOULD WORK. That's awesome.

    We could even remove the "special" cache contexts from the redirect if we want, but it might be better to keep them for debugging purposes. We could then go as far as to allowing these cache contexts to not have a value at all, only return a set of cache tags.

  • 🇬🇧United Kingdom catch

    You mean a cache context that only serves a purpose to "hint" to VC that it should add some cache tags to the redirect where it's present? Now that would be an awesome alternative. Would have to be its own kind of cache context, though. We should not mix that behavior with regular cache contexts.

    I don't think it needs to be special as such, we already have a getCacheableMetadata() method on CacheContextInterface, the hard thing will be naming. But yeah we could potentially add a new interface that indicates that the ::getContext() method is inert/null op to make them special. Like 'MetadataOnlyCacheContextInterface' or something.

    The other thing I'm wondering also though is does it really hurt to add the 'user.roles:authenticated' cache context when it's not strictly needed - it is extremely low cardinality, just anon and auth. However apart from my initial quick fix MR approach above, I couldn't find the right place to actually add that and still have the tests pass yet.

  • 🇭🇺Hungary mxr576 Hungary

    I am still catching up with this thread, but...

    Example with cache context "entity_status:node|1"

    IIRC @kristiaanvandeneynde you have suggested against doing something similar in #2628870-69: Access result caching per user(.permissions) does not check for correct user -- and fun fact, the code that I copied the comment is available in a public repo since then: https://git.drupalcode.org/project/view_usernames_node_author/-/blob/1.x...

  • 🇬🇧United Kingdom catch

    The cacheable-metadata-only redirect might be useful, but also I think we can fix this in the node access handler.

    https://git.drupalcode.org/project/drupal/-/merge_requests/11395/diffs#n... is viable I think.

    https://git.drupalcode.org/project/drupal/-/merge_requests/11402/diffs was an early start at an alternative approach that is quite broken.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I'm not sure this would work. There's a bug in the MR and the intended solution leads to several cache inefficiencies.

    NACH::checkViewAccess() currently only checks if you can view your own unpublished nodes. It also properly adds any cacheable metadata to the CacheableMetadata object that's passed into it and said metadata is merged in ::checkAccess() wherever we return.

    So not returning early, still running the grants check and then returning the grants check means we killed off access for viewing your own unpublished node. As we do not return said access check anymore. It also means that the following lines do nothing.

          if ($view_access_result) {
            $access_result->addCacheableDependency($view_access_result);
          }
    

    Now, this can be fixed. So let's move on to the changes in ::viewAccess() instead and see what's changed.

    1. In case the node is published, we skip all of the ::viewAccess() stuff and end up adding only 'node_grants:view' if there are implementations
    2. In case the node is unpublished and the user has no permission, we end up adding only 'user.permissions', which does nothing
    3. In case the node is unpublished and the user is anonymous and has the permission, we end up adding 'user.permissions', which does nothing, and 'user.roles:authenticated'.
    4. In case the node is unpublished and the user is authenticated and has the permission, we end up adding 'user.permissions', which does nothing, 'user.roles:authenticated', 'user' and 'user.node_grants:view' if there are implementations. The bold part is new.

    Now 1 soft collides with 2 for someone who can't view own unpublished and grant implementations exist. If we visit a published first and then visit the same node when unpublished, we will overwrite the CacheRedirect we had for 'node_grants:view' with a direct cache entry because it varies by nothing extra. Vice versa if we visit unpublished first and then publish the node, we write a cache redirect and then the cache entry.

    So in principle VC will work fine, it's just slightly inefficient to kill off the redirect path when the status changes or to follow a redirect path that we know will lead to an invalidated item. No big deal, I suppose.

    1 and 3 will still lead to a VC error for anonymous people with the permission while grant implementations exist because, depending on the order, we will either try to overwrite a redirect for 'node_grants:view' with 'user.roles:authenticated' or vice versa. Anonymous people shouldn't get the permission, but still...

    1 and 4 will no longer lead to a VC error for authenticated users with the permission when grants are implemented, because we will always have a redirect for 'node_grants:view' regardless of node status. However, because the node status is still not reflected in the redirect chain, we will see the same inefficiencies we saw between 1 and 2.

    Same story when grants are not implemented, but here we won't see 'node_grants:view' at all and it's merely the difference between nothing extra and some contexts added by checkViewAccess() so it's also the same as the behavior between 1 and 2.

    TL;DR: We still have errors in some ill advised configurations and we definitely still have inefficiencies across the board. The essence remains that NACH is varying by node status but we have no context in core to reflect that.

    If we don't want to support such cache contexts, then the problem becomes a VC problem because now we need to be able to add cache tags to cache redirects and we get to the possible suggestion @catch made.

    After sleeping on it, it does not seem like a VC issue after all because ::checkViewAccess() does add a cache context whenever it varies by something, only it fails to do so for the status check. That is the root of the issue.

    Moving back to node subsystem for that reason. If the solution is to add a status cache context, we can do that in a different issue which is for the cache subsystem.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    You could erase the VC error coming from 1 vs 3 by moving the node grants cache context higher up in checkViewAccess(). That could be a stopgap solution, but we really need to look into an entity status cache context instead. Visiting node pages after changes to said node will now face several extra queries that we ideally should not even run.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Comments #25 and #26 were for MR 11402. Checking 11395 in a few.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay everything I wrote holds true, just the bug I reported in #25 is not present in the other MR. Everything else still fits as the grants cache context is added too late for the stopgap to work.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Given this some more thought and I don't see "hinting" for what to tag a redirect with working.

    Say we have a scenario where we have three critical decision points:

    • First one adds cache context A
    • Second one adds B when published or X when unpublished and a pseudo-context (N:1) to indicate we need to tag with node 1
    • Third one in the X branch can add Y or Z depending on value of X

    Ideally, we want the node cache tag to be added to the redirect for B or X, but how do we know where to add it?

    We could come to a cold cache with AB(N:1), AXY(N:1) or AXZ(N:1) and that would be one big fat redirect pointed to by the initial cache contexts (). So the chain would, for example, be:
    <INIT> --> AB --> <CACHED DATA>

    Now invalidating A whenever the node changes would be silly, but for the sake of arguing let's say we write the tag at the last redirect. If we now change the status, the whole chain is flushed, but no real damage is done. However, if we visit a second node page that triggers XY, the redirect path would be rewritten to:
    <INIT> --> A --> XY --> <CACHED DATA>

    At this point, tagging the last redirect would give us what we want. Only XY and the cached data get flushed when the node status changes and we keep redirect A. But if we now visit a third node that triggers XZ, we get:
    <INIT> --> A --> X --> Z --> <CACHED DATA>

    Now we're in a pickle again. Because we need to tag X with the node cache tag, but how do we know that? When we get to the cache we only see AXZ(N:1), we do not know which context was added in which order.

    So hinting what to tag a redirect with would only work if we had a chronological order of when cache contexts were added and I still think it would be really poor DX to expect people to provide pseudo-contexts whenever the vary by something that is not part of the global context.

    But that would require a complete rewrite of our cacheable metadata add and merge methods.

    So moving on to another option: Source detection.

    If we could tell NACH where the node that's being passed in is coming from, we could add the cache context for whatever global source we used to get the node. In this case it could be route.args:node or something. An example of this is the route.group cache context in Group, but a more generic version of it.

    But that would require some sort of service where we can pass in an object and get a source in return. And if the node was actually loaded at random and passed into NACH, it would still break because we'd have no cache context to represent the status check.

    So that left us with one final solution: A cache context to represent status checks. I am really not a fan of this either because it's just plain dirty. The node in question may be from a global source, but we don't know that. Falling back to loading the node without context is just asking for performance issues and poor cache hit ratio.

    This was also touched on in the issue mentioned in #22.: To add insult to injury, we had a notion of "current user" vs "any user" being discussed there and that could lead to poor cacheability all around.

    Now, I may have got a solution but I'll write that down in a separate comment. This whole comment was to outline why we're stuck.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so moving on from my previous comment, I just thought of something that could make pseudo cache contexts work.

    Imagine a value object PseudoCacheContext that implements CacheableDependencyTrait but:

    • On any method that can add contexts it throws an exception so only cache tags and max age are allowed
    • Its getters for tags and max age would return "nothing" ([] and -1)
    • Its getCacheContexts() method sorts the tags and max-age (for better hit ratio) and puts that into a 'pseudo:some_sorted_string' cache context
    • Said cache context would always return 1 (what @catch suggested before)

    Now in VariationCache, when we detect such a context it will tag the the next chain item after the context with said metadata.

    Now we could take the example of my previous comment and tell VariationCache exactly which redirect needs to be tagged. So no matter what redirect the pseudo-context gets added to, that's where we flag the next redirect for receiving the node cache tag. So from the example above, no matter if the chain looks like:

    • <INIT> --> AXZpseudo --> <CACHED DATA>
    • <INIT> --> ABpseudo --> <CACHED DATA>
    • <INIT> --> A --> pseudo --> XZ --> <CACHED DATA>
    • <INIT> --> A --> pseudo --> X --> Y --> <CACHED DATA>
    • <INIT> --> A --> pseudo --> X --> Z --> <CACHED DATA>
    • <INIT> --> A --> pseudo --> B --> <CACHED DATA>

    We would always tag the one after the pseudo context with the node status.

    The offending code in NACH would then become:

      protected function checkViewAccess(NodeInterface $node, AccountInterface $account, CacheableMetadata $cacheability): ?AccessResultInterface {
        // NO LONGER NEEDED!
        // $cacheability->addCacheableDependency($node);
    
        // We do not know where the node comes from, but we have to vary by its
        // status. We therefore create a pseudo cache context that gets invalidated
        // whenever the node is invalidated.
        $pseudo_context = new PseudoCacheContext($node->getCacheTags(), $node->getCacheMaxAge());
        $cacheability->addCacheableDependency($pseudo_context);
        if ($node->isPublished()) {
          return NULL;
        }
        $cacheability->addCacheContexts(['user.permissions']);
    
    

    This feels like it could work with minimal effort on changing VariationCache. The only drawback is that the pseudo-context will in some cases introduce an extra redirect, but that would be the case anyway if we had a proper "node status cache context".

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    The upside of the above is that the DX isn't nearly as bad.

    It's easy to document PseudoCacheContext as:
    "When you need to vary based on a cacheable dependency's property but you have no cache context to represent that, perhaps because you don't know where the dependency came from. In that case, add the dependency to a PseudoCacheContext and pass in the PseudoCacheContext as a cacheable dependency of your return value."

    But with clearer language than what I could come up with just now.

  • 🇬🇧United Kingdom catch

    "When you need to vary based on a cacheable dependency's property but you have no cache context to represent that, perhaps because you don't know where the dependency came from. In that case, add the dependency to a PseudoCacheContext and pass in the PseudoCacheContext as a cacheable dependency of your return value."

    Could this be called CacheableDependencyCacheContext - a cache context that varies by its cache dependencies?

    If so I have one more question:

    Is there a forseeable situation where someone might want to make a similar cache context that also returns something? If so we could add CacheableDependencyCacheContextInterface and look for that in VariationCache, but otherwise just add the new object as described in #31.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Could this be called CacheableDependencyCacheContext - a cache context that varies by its cache dependencies?

    Sure, we can name it whatever makes most sense.

    But it's important to state it does not vary by the dependency, it's just that we have a dependency that causes variations which we cannot represent with a regular cache context so we need to be able to invalidate a part of the redirect chain to reflect the dependency.

    Is there a forseeable situation where someone might want to make a similar cache context that also returns something?

    I don't see a use case where the cache context would return any value, really. It will essentially serve as a substitute for a redirect chain link that we cannot otherwise declare. By virtue of that it does not make sense to have it return anything other than a fixed value (like 1).

  • 🇬🇧United Kingdom catch

    But it's important to state it does not vary by the dependency, it's just that we have a dependency that causes variations which we cannot represent with a regular cache context so we need to be able to invalidate a part of the redirect chain to reflect the dependency.

    Hmm this is true, it doesn't vary by what the dependency is, it just varies to the extent that the cache context is present (or not), and then adds invalidation on top. Maybe naming is still fine with that caveat though.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Just added the cache context and value object for review. Still need to implement in VC and the node access control handler as proof-of-concept.

  • Pipeline finished with Failed
    3 days ago
    Total: 170s
    #461403
  • Pipeline finished with Success
    3 days ago
    #461422
  • 🇬🇧United Kingdom catch

    Looking at this again and it feels almost impossible to explain depending on a cacheable dependency vs. varying by the property of a dependency.

    Just to check I extracted one of the previous quick fixes here, which compensates for the short circuit when the node is unpublished. I'm not 100% sure we should even be bypassing the node grants system when access is granted via view own unpublished.

  • Pipeline finished with Success
    3 days ago
    Total: 519s
    #461427
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I'd say the following:

    • Are you displaying something? --> dependency
    • Are you varying based on something? --> cache context we know (global) source of something, pseudo cache context if we don't

    More details and edge cases to be added, but that would be the gist of it.

    So take the status field for example:

    • If it means you show the node background color in pink or not, then IMO that's not a variation and you can simply add the node as a dependency.
    • If you vary access based on the status, we do not necessarily care about what will be output in that context. We do care about the status as part of a decision tree on what to return. So here it's a variation.
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    The MR you opened still has the following going for it:

    1 and 3 will still lead to a VC error for anonymous people with the permission while grant implementations exist because, depending on the order, we will either try to overwrite a redirect for 'node_grants:view' with 'user.roles:authenticated' or vice versa. Anonymous people shouldn't get the permission, but still...

    It also means we check the grants cache context where it doesn't apply. But that's only for one out of many configurations of status vs owner vs permissions. So I'd be able to live with that as collateral damage.

    However, this is the first time we encountered this bug and we cannot know whether we might encounter it more. We may not always have the luxury of adding an unrelated cache context just to fix cache redirect collisions. So a proper solution is still needed to be safe and that is what I was trying to come up with.

  • 🇬🇧United Kingdom catch

    We could maybe split my 'quick fix' MR out to a side issue, try to get the committed (with the test coverage) and then continue here?

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Sure why not

Production build 0.71.5 2024