Improve query and cache API so that render() doesn't have to be called to add query cache metadata

Created on 28 March 2025, 4 months ago

Problem/Motivation

The node module has very old code to add a cache context for the current user's node access grants inside a query alter.

This was added in #2557815: Automatically bubble the "user.node_grants:$op" cache context in node_query_node_access_alter() β†’

The goal is obviously to not cache and re-use a query result for a user with different node access grants to avoid information disclosures.

However, the way it's accomplished is using the renderer to render an array with just cache metadata based on wether the current request represents a cachable page. Per @catch, this was "a kludge 10 years ago (along with one or two others) to unblock dynamic_page_cache and big_pipe".

A further reflection of the problem is that for REST plugins you have to execute all your node, etc. queries in a render context so that your REST response data has the correct cache metadata.

Proposed resolution

In core there should be a mechanism to add cache metadata (e.g. cache contexts or tags) to Select and Entity queries and core should handle adding that cache metadata as part of the cache context for the current response if needed.

Possibly this should take the form of adding a callback instead of the actual data since in the current node module implementation we avoid calculating the user's node access grants if the request is not going to be cached.

This is a little tricky since we don't know what happens to the data once the query is executed, so possibly you'd have to handle this in the execute() method. That also seems like a kludge.

Remaining tasks

Decide on the architecture including how this data is added to a query and where core is going to collect the data
Define on a mechanism other than calling the renderer to add cache data. This need to handle

API changes

Change the way the query systems interact with the cache system, TBD

✨ Feature request
Status

Active

Version

11.1 πŸ”₯

Component

cache system

Created by

πŸ‡ΊπŸ‡ΈUnited States pwolanin

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @pwolanin
  • πŸ‡³πŸ‡ΏNew Zealand quietone

    Changes are made on on 11.x (our main development branch) first, and are then back ported as needed according to the Core change policies β†’ .

  • πŸ‡¬πŸ‡§United Kingdom catch
  • πŸ‡¬πŸ‡§United Kingdom catch

    If we add a cache context to a query, then only variation cache / render cache will actually cause that context to be calculated (once it's bubbled up to something that going to be written to cache), until that point, it's just a string in an array like ['contexts' => 'user'], so we should be fine to add contexts and tags directly and not need a callback system.

  • πŸ‡¬πŸ‡§United Kingdom catch

    I think we need something like ::addCacheableMetadata() and ::getCacheableMetadata() on the query objects (entity and database), and then the code that executes the query is responsible for getting the cacheablemetadata and merging it into the render array.

  • Does it makes for \Drupal\Core\Database\Query\Select (and maybe \Drupal\Core\Database\Query\SelectExtender) and \Drupal\Core\Entity\Query\QueryBase to implement \Drupal\Core\Cache\RefinableCacheableDependencyInterface and use \Drupal\Core\Cache\RefinableCacheableDependencyTrait? There's no getCacheableMetadata(), but there are getter methods for context, tags, and max-age, and you can also then merge the query object as a cacheable dependency to any other cacheable dependency object.

  • πŸ‡¬πŸ‡§United Kingdom catch

    That sounds promising.

  • Pipeline finished with Success
    30 days ago
    Total: 687s
    #534119
  • Put up draft MR 12535 implementing #6. Added a kernel test which essentially tests what Drupal\Tests\node\Functional\NodeAccessCacheabilityTest::testNodeAccessCacheabilitySafeguard() does, but without rendering. There probably should be additional test coverage for more generic usage.

    A couple questions came to mind:

    1. Should Drupal\Core\Database\Connection\SelectInterface and Drupal\Core\Entity\Query\QueryInterface extend RefinableCacheableDependencyInterface? It would eliminate the need to do instanceof checks, but there are BC considerations
    2. Implementations of EntityStorageInterface::loadByProperties() use entity queries under the hood. Should there be an optional second parameter on loadByProperties() to capture cacheable metadata?
  • Should there be an optional second parameter on loadByProperties() to capture cacheable metadata?

    Then there's the EntityRepository method loadEntityByUuid method that calls EntityStorage::loadByProperties, and possible other rabbit holes, so I don't know if it makes sense to add an additional parameter everywhere, at least not in scope for this issue.

  • πŸ‡¬πŸ‡§United Kingdom catch

    #9.1 does the Select class count as a base class here? It doesn't have the Base keyword at the end but that's probably because it was originally added in c.2008. If it does, then I think we can add the interface to SelectInterface directly because Select will then implement the methods for any subclasses.

    #9.2 based on #10 that sounds like good material for a follow-up but we should open one :)

  • Pipeline finished with Success
    28 days ago
    Total: 849s
    #535367
  • Pipeline finished with Success
    28 days ago
    Total: 805s
    #535378
  • Yes, it looks like Select is essentially a base class. SelectExtender also implements SelectInterface, but it is essentially a base class as well. And for entity queries, QueryBase is the base class for QueryInterface. Updated the MR to have the respective interfaces extend RefinableCacheableDependencyInterface and added a CR [#3533258].

    The SQL entity query class merges the cache metadata from the underlying Select object, so this works for content entities. Config entity queries or other entity types using key value storage don't capture any cacheable metadata from SQL key value queries. Doing so would add complexity to key value and this seems like an edge case.

    Follow up per #11: ✨ [PP-1] Make it possible to capture cacheable metadata from underlying entity queries in EntityStorageInterface::loadByProperties() Active .

    Probably still needs some more test, so leaving in NW and the Needs tests tag.

  • Pipeline finished with Success
    28 days ago
    Total: 450s
    #535473
  • Pipeline finished with Success
    26 days ago
    Total: 578s
    #537526
  • OK, added more general tests. I also removed the NodeAccessCacheability kernel test I'd added before, because I think it's duplicating other tests mostly at this point, and instead added asserts in the functional test to test that the query tags get added to the render array when the query is not running in a render context.

    Additionally, I found dead/broken test code #2878483: loadEntityByUuid() should skip access checks β†’ in query alter hook in EntityTestHooks.php, and I removed all that and reused the hook for tests needed here.

    One last thing:
    It occurred to me that maybe entity query objects should include the corresponding entity type list cache tags and entity type list cache contexts. However, when I tried that the list cache contexts for the Node entity type are ['user.node_grants:view'], which would kinda defeat the point of all the logic in the query alter hook only to add those contexts conditionally.

    Separately I was also wondering if the page cache contexts should be added to PagerSelectExtender, but I haven't thought that through completely.

    I think the MR is ready for review regardless.

Production build 0.71.5 2024