Cache tags grows endlessly

Created on 28 November 2019, almost 5 years ago
Updated 24 January 2024, 10 months ago

Problem/Motivation

Drupal core ships with cache tags support - awesome.
To ensire that tags for an item were not invalidated on cache read a CacheTagsChecksumInterface service is used (eventuially) and D8 core provides such a service as a DB variant.

My issue is with the cache tags sub-system internals. As it is implemented, the list of cache tags on the platform will grow endlessly due to the DatabaseCacheTagsChecksum implementation.

In the following scenario with a highly-volatile custom entities - add 100k instances, delete them, add new 100k back.

The system will end up with a 200k cache tags in the table, 100k of them will not be used again ever. They will just stay there and clutter the database and cause overall slow-downs. Imagine when this process continues for a while...

Even after a full cache clear (that happens only rarely) all tags are still kept there.

In my case the cache tags is the highest throughput and biggest time consumer as a DB query in the whole of system, even though it's fast on average. I have around 40-50k valid entities in the system and around 120-130k cache tags in the table.

I think this problem affects only the DB implementation, as Memcache and Redis (if they have implementations on the interface) will scale in O(1) compared to Log(N) based on the amount of data in the system. On top of that they have a robust garbage collection mechanisms in case of memory pressure. SQL databases have none of that.

Proposed resolution

Can we have the list of cache tags on the portal truncated when the whole cache get's cleared.
As I suspect this should not be a problem, as the whole set of cache was just invalidated either way.
I think cache tags should be deleted whenever content is deleted on the system.

We should also consider deleting cache tag entries whenever the related entity is deleted as well (if possible). For example: delete of node with ID 1 will delete the node:1 tag from cachetags as well. Cache check-sums that depend on it will be invalidated based on the checksum, as the counter will not be valid (0) instead of anything that was present before.

Any other ideas are welcome.

Remaining tasks

Discussion, decision, patch...

User interface changes

None.

API changes

TBD. None expected.

Data model changes

TBD. None expected.

Release notes snippet

TBD.

πŸ“Œ Task
Status

Active

Version

11.0 πŸ”₯

Component
CacheΒ  β†’

Last updated about 5 hours ago

Created by

πŸ‡§πŸ‡¬Bulgaria ndobromirov

Live updates comments and jobs are added and updated live.
  • Performance

    It affects performance. It is often combined with the Needs profiling tag.

Sign in to follow issues

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡ΊπŸ‡ΈUnited States SamLerner

    Could someone describe what the potential negative impact is to clear the entire cachetags table?

    Or, how do you determine what tags are no longer in use? Check each one to see if it references deleted stuff? Could we use that to run a cleanup command on a regular basis?

  • πŸ‡¨πŸ‡¦Canada gapple

    As I understand, there is no problem if cachetags are cleared at the same time as a full cache flush - a new cachetag entry will be added to the database as needed when a tag is next invalidated. If done automatically when flushing the cache, it would just make the operation take longer.

    If tags are cleared without clearing cache items, there is a specific, probably unlikely, case where a cached item could have the same checksum and be out-of-date but still served.
    The checksum is a sum of the invalidations on the item's tags, so if:
    - Item saved with: A:1,B:0 (checksum 1)
    - tags are cleared, so invalidations are reset to 0
    - B is invalidated
    - the cache item is valid against the new checksum of A:0,B:1 despite being out-of-date

    The checksum is checked for equality, so a lower checksum from the database will still invalidate the cache item (e.g. an item with two tags' checksum is 12, but the cachetags in the database have both been reset to 0 - since 0 != 12 the item will be invalid).

  • πŸ‡³πŸ‡ΏNew Zealand jweowu

    IIUC (see #12) the main problem will be that you cannot (reliably) clear all cachetags "at the same time" as clearing the caches.

    Even trusting that all cache backends will reliably behave the same way, clearing caches invokes hooks so that modules can react and clear their data, and I assume there's no guarantee that along the way some of that arbitrary code doesn't cause something to be cached and invalidated.

    If all of your caches were in the database and you knew where they all lived and you executed a single transaction which (only) truncated all the cache tables, deleted any other data which was required, and deleted the cachetags then I imagine that would be fine; but in practice I don't think "clearing all caches" is nearly so clean a process.

    You can't naively purge cachetags before clearing other caches, because cache entries may be needed in the process of clearing caches.

    You can't naively purge cachetags after clearing other caches, because in the process of clearing caches you may have acquired and invalidated new cache entries.

    (n.b. This is my speculation -- I don't have a deep understanding of these processes so I might be wrong, but thinking about the issue had led me to those conclusions.)

    I expect we need a way to mark the pre-existing cachetags prior to the full purge, so that entries which are unchanged following the purge can be recognised and removed.

  • πŸ‡ͺπŸ‡ΈSpain eduardo morales alberti Spain, πŸ‡ͺπŸ‡Ί

    Some questions are related here with Vollatile Custom Entities, all entities' cache tags are invalidated on post delete, which makes sense if you are using these entities on cached places, like the case of Nodes, but if those caches tags are not used anywhere, it is better to override the method postDelete to avoid add new entries to the database,

    Method postDelete EntityBase.php:

      public static function postDelete(EntityStorageInterface $storage, array $entities) {
        static::invalidateTagsOnDelete($storage->getEntityType(), $entities);
      }
    

    Overriding the postDelete method will not invalidate the tags on delete.

  • Hi :)

    I use Drupal with OVH since 5 years, all is very nice with Durpal, but complex as you know.... But I have ONE problem since 5 years, all time the same, and 5 years I looking for a solution for that...

    My SQL become big all time.... WIth PHP 8 I need to Clear Cache all days or my DATA SQL go to more 8giga... ANd I am block all time ba my hosting.... :(

    SOmetime ca explain me a solution please? 5 Years I try....

    I am on Drupal 9.5

    Thx. :)

  • πŸ‡¬πŸ‡§United Kingdom catch

    Replying to #26: Cache tags can pretty reliably be cleared after bins are emptied. If a new cache item is created and not invalidates, its cache tag checksum would be 0 which will match the cache tag blank slate.

    If it's created and then immediately invalidated, it's cache tag checksum will be 1 or more which will not match 0, and then it would be invalidated when next retrieved.

    The only case that is not covered is if it's invalidated, has a checksum of e.g. 1, the cache tags are wiped, it is not requested, then they're invalidated in such a way that the checksum matches again and its only requested again then. This is such an extreme edge case/race condition that it can probably be ignored.

    A workaround would be a cache tag last garbage collection timestamp to compare against and write that alongside the cache items. Then anything created before it would also be wiped but that adds an extra state or k/v request on every request.

  • πŸ‡³πŸ‡ΏNew Zealand jweowu

    > This is such an extreme edge case/race condition that it can probably be ignored.

    I can only assume that wasn't the conclusion when this was implemented, though -- it doesn't seem plausible that this issue never came up in discussion at the time.

    > A workaround would be a cache tag last garbage collection timestamp to compare against

    Yeah, I was also pondering that approach in #12 and #13. I think it's a pretty reasonable idea.

    An alternative would be to have a pre-purge process which copied the cachetags table to a temporary table, and then post-purge deleted all cachetags matching an entry in the temp table. That has its own noteworthy costs, but they're isolated to the time of the purge. If the table was really gargantuan, though, it might not be great (and at the point in time when a fix is deployed, some sites are inevitably going to have tables fitting that description). Offhand I think I'd lean towards the timestamp column.

  • πŸ‡³πŸ‡ΏNew Zealand jweowu

    Amavi: Is that on account of the cachetags table specifically? If a normal cache rebuild fixes things for you -- if temporarily -- then it's definitely not about cachetags (as the entire reason for the present issue is that cache rebuilds do not purge the cachetags table).

    Your database will have many different cache tables, and I suspect your problem is something different to this issue. You should start by confirming which specific cache table(s) are getting so large, and then you can look for or post an issue related to that.

  • πŸ‡¬πŸ‡§United Kingdom catch

    #636454: Cache tag support β†’ is the origialnal issue. I worked on it at the time, haven't reread it for ages, but it would not surprise me if purging just didn't get discussed or got deferred to a follow-up that never happened. It was my more or less the first API addition in Drupal 8 and years before a release so that particular problem was very abstract for a very long time.

  • πŸ‡³πŸ‡ΏNew Zealand jweowu

    A more-palatable variant of the temporary table suggestion has occurred to me. I don't know how practical it is, but in principle I think this avoids the down sides of the other approaches mentioned.

    In essence, while the cache rebuild is taking place, new cache invalidations get written to a temporary table. Then, after the cache rebuild, the cachetags table is truncated, and the rows of the temporary table are inserted.

    In order for that to work, cache lookups need to know about the temporary table, something like:

    if (a cache rebuild is in progress) {
      check the temporary table for cache validity
      if (the temporary table contained a row for that cache id) {
        return result;
      }
      else {
        // nothing about this ID in the temporary table
        check the regular cachetags table
        return result
      }
    }
    else {
      // not currently rebuilding the cache
      check the regular cachetags table
      return result
    }
    

    And similarly, when invalidating a cache entry the new invalidations value written to the temporary table would be an increment of the value in the temporary table if a row existed there already, and otherwise an increment of the row from the original cachetags table.

    The lookups during cache rebuilds could be on a join of the two tables, rather than two separate look-ups.

    No timestamp column needed, and no wholesale copying of cachetags; and outside of cache rebuilds the behaviour can be much the way it is at present.

    Is that practical? I won't be surprised if I'm missing something, and I haven't thought through the ramifications of multiple simultaneous cache rebuilds (if that's currently permitted to happen), but it seemed worth suggesting.

  • πŸ‡¬πŸ‡§United Kingdom catch

    I think we should just add cache tag purging as part of a drupal_flush_all_caches() step, immediately after emptying the bins, and document + open a follow-up for the potential race condition where a cache item is both set and and invalidated but then not requested until after a further invalidation again, the chances of that happening are miniscule but the potential issues deriving from storing timestamps or creating temporary tables will affect everyone.

    Also I think it's worth looking at starting and ending a database transaction in drupal_flush_all_caches() in case that's viable.

  • Got the same issue but on a larger scale on a site with lots of webform submissions, around 3.5M+ and for each submission there's an entry in the cachetags table, currently sitting at a bit more than 4M rows.

  • πŸ‡¬πŸ‡·Greece mariaioann

    I have the same problem with Message entities. I am creating messages for as notification for a large number of users, but they are being purged where they are 30 days old. Currently, we have 400K messages, but the cachetags table has 23M entries for messages, as it included all deleted messages as well.
    Is it safe to delete the relevant cache tag on a message postDelete hook?
    What I have not understood is why don't we delete an entity's cache tag when the entity gets deleted in general?

  • πŸ‡¬πŸ‡§United Kingdom catch

    What I have not understood is why don't we delete an entity's cache tag when the entity gets deleted in general?

    Cache tag storage and implementation is swappable, so there's no inherent concept of a cache tag existing as a row in a database with a counter. It would for example be possible (but very slow) to store the cache tags with the cache items, and query all cache items when a cache tag is invalidated, and have no dedicated cache tag storage at all. Because of this, there's no concept of the tag as a thing existing in itself, the checksum implementation that core uses introduces the counter system on top of string tags, but consumers of the cache API, like the entity system, don't need to know about it - they just get/set/delete cache items and invalidate tags.

    If you have a high traffic site with a lot of users/content, you should strongly consider using https://www.drupal.org/project/redis β†’ , which won't run into this problem because it evicts items when it runs out of memory. This is a good idea for lots of reasons other than a large cache tags table.

  • πŸ‡§πŸ‡ͺBelgium wim leers Ghent πŸ‡§πŸ‡ͺπŸ‡ͺπŸ‡Ί

    @MariaIoann

    1. Nothing prevents you from saying "my entity does not need cache tags". See \Drupal\Core\Entity\EntityInterface::getCacheTagsToInvalidate() and \Drupal\Core\Cache\CacheableDependencyInterface::getCacheTags(). Message entities the way that you describe them appear very ephemeral, so it makes sense to me that they would not use/need cache tags.
    2. See \Drupal\Core\Datetime\Entity\DateFormat::getCacheTagsToInvalidate() for another example.
Production build 0.71.5 2024