Enable an entity query's return value to carry cacheability

Created on 28 January 2019, almost 6 years ago
Updated 18 September 2023, over 1 year ago

Problem/Motivation

>3 years ago, #2557815: Automatically bubble the "user.node_grants:$op" cache context in node_query_node_access_alter() introduced automatic cacheability bubbling for node_query_node_access_alter(). But ideally, this wouldn't need to bubble (onto the currently active render context if any) at all, ideally this cacheability would just be associated with the query itself.

We encountered this in the API-First initiative too, in #3005826: Follow-up for #2984964: JSON API + hook_node_grants() implementations: count queries still result in cacheability metadata leak .

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Feature request
Status

Active

Version

11.0 🔥

Component
Entity 

Last updated 1 day ago

Created by

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Just stumbled upon this issue, but my 2c is also that we should make the bubbling of cacheability a separate service. Right now it's deeply baked into the renderer, like VariationCache was once baked into the render cache. If we abstract the bubbling into a separate service, then other non-rendering pieces of logic could also call the service to add cacheability to the whole. We'd be able to get rid of methods like Renderer::executeInRenderContext().

    Having said that, the challenge would be scoping this (like we currently have render contexts) so that unrelated access checks on the whole request do not poison the cacheability of smaller render arrays on the page, leading to everything being nigh uncacheable because there are too many tags and contexts involved.

    Perhaps we could keep the render context concept for the time being, but introduce this one global context as a service where everyone can add to and then have the renderer also add its bubbled cacheability to said global context. Kind of like what it does already, but then abstracted out for the top level.

  • 🇩🇪Germany geek-merlin Freiburg, Germany

    @kristiaanvandeneynde: Good thoughts!

    AND: You mentioned the core problem:
    > Having said that, the challenge would be scoping this...

    I once read the whole issue and story about why we have that render contexts. It was a pragmatic solution because adding cacheability to some strings would have had a too big impact. (I remember some render array attributes that do not eat renderables.)

    So we should take the opposite direction: Allow every value to carry cacheability. E.g. by adding a stringable renderable, and fix the remaining places where this causes cough.

    That said, what are the use cases you got in mind where things that are not renderable should impact the response cacheability? And why not simply add those cacheabilities to the redner array?

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    That said, what are the use cases you got in mind where things that are not renderable should impact the response cacheability? And why not simply add those cacheabilities to the redner array?

    Any sort of access check on the route or simply at any level before the renderer. Keep in mind access checks don't always rely merely on user.permissions which gets added anyway, but can also rely on domain, country of origin, an external system, etc.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Example: In the footer of an astrophysicist's blog we calculate today's distance between Earth and the Sun because why not. This calculation takes a lot of resources and we do not want to recalculate that for every page and every variation of said page, all we need is a cache context representing the day of the year.

    The good news is that because of cache keys, we can cache smaller pieces of the page like these separately in the render cache. The bad news is that the render always adds the required_cache_contexts even to smaller pieces. So even if our distance widget does not vary by permissions, we still add 'user.permissions', 'languages:language_interface' and 'theme' to the cache contexts. See (coming from Renderer):

        if ($is_root_call || isset($elements['#cache']['keys'])) {
          $required_cache_contexts = $this->rendererConfig['required_cache_contexts'];
          if (isset($elements['#cache']['contexts'])) {
            $elements['#cache']['contexts'] = Cache::mergeContexts($elements['#cache']['contexts'], $required_cache_contexts);
          }
          else {
            $elements['#cache']['contexts'] = $required_cache_contexts;
          }
        }
    

    Now, there's a good reason we have these required cache contexts because if we relied on people properly adding them, we'd have security reports all across contrib and perhaps even core.

    But here's the thing, if we want to redo how cacheability is added to a response, we need to be careful that we don't poison these small unchanging blocks even more. Adding more (global) cache contexts to any render array in a response would mean that our block that in theory only varies by day of the year in practice varies by interface language, permissions, theme, day of the year and whatever cache context you added on a higher level.

    TL;DR: It's really important that whatever solution we come up with does not get tacked onto render arrays, but rather that the total cacheability from renderRoot() makes it into a new service that at the end of a response puts all bubbled/gathered cacheability on the response object instead.

  • 🇨🇭Switzerland berdir Switzerland

    I'm quite confused about the discussion in this issue and how it's related to entity queries.

    The issue title is currently "Enable an entity query's *return value* to carry cacheability".

    Which as being discussed here is tricky, it's not even always an array, it can be a count query then it's just a number, or the aggregate version has it's own return structure.

    However, I think we could consider to just make the entity query object itself implement CacheableDependencyInterface, then you can use that and add it to your cacheability metadata. There are still some tricky things, like how the inner SQL query object and alter hooks then attach their stuff there, but I think we can find solutions for that.

    Whether or not that metadata is then automatically/force-added to the global render context is IMHO a different question and should be its own issue. Doing that is in general just a workaround and hack if we don't have a way to pass things through properly.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I'm quite confused about the discussion in this issue and how it's related to entity queries. The issue title is currently "Enable an entity query's *return value* to carry cacheability".

    Adding cacheability to an array of IDs (the return value) seems hard to pull off without breaking BC. Then there's the question of what to do with said cacheability as, currently, nothing would consume it. Finally, I'm not sure what the benefit would be of a cacheable return value as we'd somehow have to reliably convert the entire entity query into a cache ID with contexts (because the return value may change based on several factors).

    Would it not be better to not cache entity queries and instead cache the thing you used them for? Be that a render array, an access check or whatever.

    There are still some tricky things, like how the inner SQL query object and alter hooks then attach their stuff there, but I think we can find solutions for that.

    This might be really hard to pull off, I'm genuinely curious as to how you'd approach this.

    Whether or not that metadata is then automatically/force-added to the global render context is IMHO a different question and should be its own issue. Doing that is in general just a workaround and hack if we don't have a way to pass things through properly.

    Fully agree, from the early comments on the discussion here went in that direction and I must admit I also simply rolled with it :)

    Maybe we should ping Wim and ask what the goal of this issue was? It's a bit muddy to me to say the least.

  • 🇨🇭Switzerland berdir Switzerland

    > Adding cacheability to an array of IDs (the return value) seems hard to pull off without breaking BC.

    Yes, which is why I'm saying we should use the whole query (builder) object for that purpose, then you can do CacheableMetadata::fromArray($my_build_array)->merge($query);. You can't 100% chain-call it then so you still have access to the the object, but that should work.

    > Then there's the question of what to do with said cacheability as, currently, nothing would consume it.

    See above, consuming it is the responsibility of the specific implementations that run an entity query and then put the results in a render array or something as long as we don't force-bubble it, which imho we shouldn't.

    FWIW, there are actually very few cases that qualify for that in core, as most listings are views or methods/functions that get things from something else than an entity query. Or the horrible node_title_list() that should have been removed a long, long time ago ;)

    > This might be really hard to pull off, I'm genuinely curious as to how you'd approach this.

    Maybe not that hard, query alters have contexts, see what I'm doing with https://www.drupal.org/project/drupal/issues/3188914 🐛 ContentEntity migration source doesn't consider the migration map Needs work for example. If we pass in the query builder object down as context, alter hooks can add cache contexts or tags to it.

  • 🇪🇸Spain pcambra Asturies

    Just adding this comment for discoverability, when you add accessCheck(TRUE) to an entity query, as per #2287071: Add cacheability metadata to access checks , there's cache metadata and that can cause the dreaded LogicException: The controller result claims to be providing relevant cache metadata, but leaked metadata was detected error.

  • 🇨🇭Switzerland berdir Switzerland

    Yes, I suppose that's kind of expected, what you need to do is do your thing in a render context and then collect the cacheability metadata from that. It's an unfortunate workaround due to this issue not being implemented and then you need to handle that when not already working in a render context.

    The idea here is that we could avoid that by attaching and collecting that on the $query object and then you still need to do the same thing, but in a nicer way.

Production build 0.71.5 2024