Introduce a list of "common cache tags" to reduce lookup query amount

Created on 26 March 2024, 9 months ago
Updated 27 June 2024, 6 months ago

Problem/Motivation

Whenever a cache entry is retrieved from the database, we check what tags were added to it and retrieve the invalidation count for said tags from the cachetags table. This leads to multiple queries per request where several tags are often repeated across (almost) all requests.

It has even led to other DX issues being deemed inadvisable because of this drawback: 📌 Allow plugin derivers to specify cache tags for their definitions Postponed: needs info

Steps to reproduce

Run the StandardPerformanceTest, witness the high counts in the following assertion:
$this->assertSame(COUNT_GOES_HERE, $performance_data->getCacheTagIsValidCount());

Every call in that count has the potential to trigger a query in DatabaseCacheTagsChecksum::getTagInvalidationCounts()

Proposed resolution

From the other issue:

What if we build a simple system that allows us to either specify or identify commonly used cache tags and, at the very first call to ::calculateChecksum(), we load ALL of them with one query. This would both improve the performance of Drupal as it stands and make the current issue less of a hassle because we could (but don't have to) add whatever cache tag our deriver specifies to the "common" list.

I feel like this wouldn't necessarily have to be a smart system that tries to automagically deduce which tags belong to the list, but could rather be a very rudimentary system where both core and contrib can specify tags to add to the list. We could then even have the entity type manager make sure that all entity type's list tags are added to said list, for instance, but that's up for discussion.

  1. Look at Grafana to see all of the queries like SELECT [tag], [invalidations] FROM {cachetags} WHERE [tag] IN ( :tags[] )
  2. Create an event dispatcher for "common cache tags", write an event subscriber that adds the common cache tags to the list.
  3. Optionally create another subscriber that depends on the entity type manager and adds all list cache tags
  4. Get the result of this in DatabaseCacheTagsChecksum::getTagInvalidationCounts() and on the first call merge these into the ones we're taking to the DB

Remaining tasks

We might need to have another look at PerformanceTestTrait::isDatabaseCache() because the call amount to ::getTagInvalidationCounts() will not be reduced, but the query count should. So we might want to start tracking these queries again somehow.

User interface changes

None

API changes

New event to declare common cache tags

Data model changes

None

Release notes snippet

None?

Feature request
Status

Needs work

Version

11.0 🔥

Component
Cache 

Last updated 3 days ago

Created by

🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @kristiaanvandeneynde
  • Pipeline finished with Failed
    9 months ago
    #129526
  • Pipeline finished with Failed
    9 months ago
    Total: 131s
    #129527
  • Status changed to Needs work 9 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    This already proves that our isValid count doesn't translate 1:1 to database queries. Perhaps if this goes green it should already be committed so that the work being done here can more easily prove the count difference.

    Even if we don't commit this in its current form, we can still compare results later on to the following.

    • testFrontPageAuthenticatedWarmCache() count 6
    • testFrontPageHotCache() count 1
    • testAnonymous() counts 33, 26 and 24
    • testLogin() count 14
    • testLoginBlock() count 29

    Marking as NW because it doesn't contain the actual fix yet.

  • Pipeline finished with Success
    9 months ago
    Total: 665s
    #129529
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Here are the common stand-alone cache tags I found in Grafana:

    • CACHE_MISS_IF_UNCACHEABLE_HTTP_METHOD:form
    • config:core.extension
    • config:search.settings
    • config:system.site
    • config:user.role.anonymous
    • config:user.role.authenticated
    • entity_bundles
    • entity_field_info
    • entity_types
    • http_response
    • library_info
    • local_task
    • rendered
    • route_match
    • routes
    • views_data

    Then there were some entity type related tags we should put in a separate subscriber:

    • block_content_view
    • block_view
    • node_list
    • node_values
    • node_view
    • taxonomy_term_list
    • user_values
    • user_view

    Then we had some tags related to the following that could be added dynamically in follow-up issues for new event subscribers:

    • Image styles
    • Placed blocks
    • Menus
      • config:system.menu.account
      • config:system.menu.main
    • Text formats
      • config:filter.format.basic_html
      • config:filter.format.full_html

    IMO we should start with the common ones and the entity type related ones as those do not need to be cleared at all unless a cache clearing event takes place (new module, new entity type, ...) The event subscribers based on config or entities (menus, blocks, formats, image styles, ...) need to somehow clear the "common cache tag" list whenever the list changes.

    The cached list of common cache tags should be manually cleared by these event subscribers when necessary rather than use cache tags itself because that will get really nasty really quickly :D

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Latest commit does not introduce the event stuff yet, but shows a proof of concept where we hard-code the common list from above. The results are amazing:

    • testFrontPageAuthenticatedWarmCache() count 6 -> down to 2
    • testFrontPageHotCache() count 1
    • testAnonymous() counts 33, 26 and 24 -> down to 20, 17 and 16
    • testLogin() count 14 -> down to 3
    • testLoginBlock() count 29 -> down to 18

    That's -67%, -39%, -35%, -33%, -79% and -38% amount of queries.

  • Pipeline finished with Success
    9 months ago
    Total: 481s
    #129637
  • Pipeline finished with Failed
    9 months ago
    Total: 576s
    #136166
  • Pipeline finished with Failed
    9 months ago
    Total: 130s
    #136288
  • Pipeline finished with Failed
    9 months ago
    Total: 167s
    #136289
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Switched to an event subscriber. left a bunch of @todo comments.

    • We might want to cache the outcome of the event.
    • The entity type one is still hard-coded but should be dynamic
    • This might need tests

    Reason I'm using:

        calls:
          - [setEventDispatcher, ['@event_dispatcher']]
    

    Is because if I tried to add it as a proper argument Drupal wouldn't even boot because DrupalKernel::$defaultBootstrapContainerDefinition contains cache_tags_provider.container and at that point we do not have a container yet, so event_dispatcher doesn't resolve. Trying to add 'event_dispatcher' to that array just caused more errors where the autowired 'session' closure wasn't being passed in. Using a setter made all of these issues go away.

  • Pipeline finished with Success
    9 months ago
    Total: 726s
    #136560
  • 🇬🇧United Kingdom catch

    I opened 📌 Add JSON:API performance tests Active based on the discussion in the MR.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so a local attempt at getting this to work with entity types dynamically was running into eternal loops, but I just had a chat with @catch and we think we may have a solution for that:

    We fire an event early on that asks for this list of cache tags, when/if we find this list in the bootstrap cache we set it on the container. In the CacheTagsChecksumTrait, we check for said parameter and don't do anything special if it's not there. This will prevent eternal loops.

    Then, we can have the entity type manager load all the info it needs and spit out a list of common cache tags. In the end, we cache said list without cache tags (because here be monsters), but with a max-age in the bootstrap cache. Because it's just an optimization, we don't need to worry that much if this cache becomes stale for a short while; next cache clear or after the list expired it will be recalculated anew.

Production build 0.71.5 2024