Add auto-refresh/quasi-LRU to database backend

Created on 14 April 2016, almost 9 years ago
Updated 15 January 2025, 21 days ago

Problem/Motivation

Follow-up from #2526150: Database cache bins allow unlimited growth: cache DB tables of gigabytes!

Fabianx:

I think if we do that, we should also have some pseudo-LRU:

if ($refresh_cache_time && $age_remaining < $refresh_cache_time) {
  // Set the cache item again (treat it like a cache miss, but only re-cache; do not rebuild).
}

Proposed parameters, per bin:

- max_age: 86400
- max_age_jitter_percent: 5
- refresh_cache_time: 3600 // Do not let frequently accessed data expire, re-cache it instead.
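
A minimal sketch of how the three proposed per-bin parameters could interact, written in plain PHP outside of Drupal. The function names jittered_expire() and needs_refresh() are hypothetical and not part of any Drupal API. The jitter is interpreted here as a random offset of up to plus or minus max_age_jitter_percent of max_age, which is an assumption, since the proposal does not define it.

```php
<?php

/**
 * Hypothetical helper: expiry timestamp with +/- jitter applied to max_age.
 */
function jittered_expire(int $now, int $max_age, int $jitter_percent): int {
  // Largest offset in seconds, e.g. 5% of 86400 is 4320.
  $max_offset = intdiv($max_age * $jitter_percent, 100);
  return $now + $max_age + random_int(-$max_offset, $max_offset);
}

/**
 * Hypothetical helper: TRUE when a still-valid item is inside the refresh window.
 */
function needs_refresh(int $expire, int $now, int $refresh_cache_time): bool {
  $age_remaining = $expire - $now;
  return $refresh_cache_time > 0
    && $age_remaining > 0
    && $age_remaining < $refresh_cache_time;
}
```

With max_age 86400 and refresh_cache_time 3600, an item read during its final hour would be re-cached, while an item that is never read simply expires.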

< berdir> Fabianx-screen: that would be interesting, but then we need two max-ages.. since some items might have a "hard" max-age
< Fabianx-screen> berdir: Yes, but the cache backend can distinguish with a simple flag:
< Fabianx-screen> $data->is_default_max_age => TRUE / FALSE,

berdir:

I'm not a huge fan of introducing that complexity here. It would require storage changes in all the cache tables for one thing...

Fabianx:

- b) A mechanism for refreshing cache items that are accessed before the TTL expires is added (pseudo-LRU)

As the setting is per bin, a site owner can, for example, use cache_page as Expire + Refresh to ensure that larger items that are frequently accessed stay in the cache, while smaller items that are not accessed at all expire (as Wim Leers wanted).

On the other hand, someone who wants to cache smaller items could just use a larger ttl for cache_render and a smaller one for cache_page.

Meanwhile, refreshTime would work like this:

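protected function prepareItem($cache, $allow_invalid) {
  // ...

+  // Refresh originally permanent items that are read within the last
+  // $this->refreshTtl seconds before they expire.
+  if ($this instanceof TtlAwareCacheBackendInterface && $cache->ttl === CacheBackendInterface::CACHE_PERMANENT && $cache->expire - $this->refreshTtl <= REQUEST_TIME) {
+    // Sketch only: setMultiple() expects items keyed by cid with a new
+    // expire value; alternatively, just update $expire.
+    $this->setMultiple([$cache]);
+  }

  return $cache;
}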

One problem I see is that we have nowhere to record whether an item should be cached permanently or is truly time-dependent. However, I think we should store the ttl in addition to the expire timestamp in any case; that would solve the problem for all cache backends.

catch:

The refresh idea is a good one, and I've done similar last-minute-update logic in custom code before, but it's going to require an additional column in the database schema and an update. Also, we'd have to do that in a way that doesn't blow up before the update runs, which is not going to be straightforward: the column will be missing on any ->get() call.
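
One way to picture the compatibility concern: a row fetched before the schema update has run will not carry the new property. A minimal sketch, assuming the new column is called ttl (the name used in the prepareItem() snippet, not an existing column) and a hypothetical helper item_ttl(). It treats a missing value as unknown so that no refresh is attempted. This only covers reading the row object; a SELECT that names the column explicitly would still fail until the update has run.

```php
<?php

/**
 * Hypothetical helper: returns the stored ttl, or NULL when the row predates
 * the schema update and has no ttl property.
 */
function item_ttl(object $cache): ?int {
  return isset($cache->ttl) ? (int) $cache->ttl : NULL;
}
```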

Proposed resolution

Add a new column storing the ttl. We also need to know whether an item was originally supposed to be permanent.

For permanent cache items, update the cache bin with a new expire timestamp when they are read within the last segment of their current ttl, on the assumption that they will be requested again.
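
The proposed resolution could be sketched as follows, in plain PHP without any Drupal classes. Everything here is an assumption made for illustration: the originally_permanent property, the $max_age and $refresh_window parameters, and the function name refresh_if_due(). The "last segment" is modelled as a fixed number of seconds before expiry.

```php
<?php

/**
 * Hypothetical sketch: extend the expiry of an originally permanent item when
 * it is read during the last $refresh_window seconds before it expires.
 *
 * Returns TRUE when the caller should write the item back to the bin.
 */
function refresh_if_due(object $item, int $now, int $max_age, int $refresh_window): bool {
  // Items with a "hard" expiry are never extended.
  if (empty($item->originally_permanent)) {
    return FALSE;
  }
  // Already expired, or not yet inside the refresh window.
  if ($item->expire <= $now || $item->expire - $now >= $refresh_window) {
    return FALSE;
  }
  $item->expire = $now + $max_age;
  return TRUE;
}
```

Returning a boolean rather than writing directly keeps the decision separate from the storage call, so the same check could sit in front of any backend.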

Remaining tasks

User interface changes

API changes

Data model changes

Task
Status: Active
Version: 11.0
Component: cache system
Created by: catch (United Kingdom)

  • Performance

    It affects performance. It is often combined with the Needs profiling tag.


Comments & Activities


  • 🇬🇧United Kingdom catch

    Someone in Slack was reporting problems with this: cache tables either get huge, or setting a limit leads to constant deletes. So this is probably still relevant.
