Add garbage collection for URL cache

Created on 12 November 2019, about 5 years ago
Updated 13 September 2023, over 1 year ago

Situation

This is a follow-up to #3032538: Hooks for modifying the cache & exclude URLs, in which I wrote:

The hooks make it possible for the list of cache URLs to change more often than it used to. That is okay, because the SW will try to update itself when a new version is available. However, it also means that the SW cache can contain cached responses that are no longer relevant. Maybe we should introduce garbage collection as one of the activate tasks.

The code could look something like this:

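    // Possible "activate" task: prune stale entries from the current cache.
    caches.open(CACHE_CURRENT).then(function (cache) {
      return cache.keys().then(function (cacheKeys) {
        return Promise.all(cacheKeys.map(function (request) {
          // isGarbage() is a placeholder (not an existing function) for the
          // logic that decides if this entry is garbage.
          if (isGarbage(request)) {
            return cache.delete(request);
          }
        }));
      });
    });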

To which @rupl replied:

[...] that is indeed a tricky situation.

Who is to say which cached URLs are no longer needed? You might have people caching other content from within the DOM and you certainly don't want to delete that.

I blogged about this some time ago: https://chrisruppel.com/blog/service-worker-offline-content/#version-man...

The takeaway is that to avoid blowing away custom-saved content, I put it in a silo away from the "app" assets. There's no way to force someone to follow this convention; only clear documentation can steer people away from using the module-managed cache. Perhaps providing helper functions for caching custom content would help, but it's also extra JS that I don't want to send unless the site admin actually uses it.
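As a rough illustration of the "silo" idea described above, the sketch below separates module-managed caches from custom ones by name, so that cleanup only ever touches the former. The prefix `pwa-main-` and the functions `deletableCaches` and `pruneOldCaches` are assumptions made up for this example; they are not taken from the module or from the blog post.

```javascript
// Assumed naming convention: every cache managed by the module shares this
// prefix. Custom-saved content lives in caches with any other name.
const MODULE_CACHE_PREFIX = 'pwa-main-'; // hypothetical prefix

// Returns the cache names that cleanup may delete: module-managed caches
// other than the one currently in use. Custom silos are never returned.
function deletableCaches(cacheNames, currentCacheName) {
  return cacheNames.filter(function (name) {
    return name.startsWith(MODULE_CACHE_PREFIX) && name !== currentCacheName;
  });
}

// Possible activate task. `cacheStorage` is the CacheStorage object that a
// service worker exposes as the global `caches`.
function pruneOldCaches(cacheStorage, currentCacheName) {
  return cacheStorage.keys().then(function (names) {
    return Promise.all(
      deletableCaches(names, currentCacheName).map(function (name) {
        return cacheStorage.delete(name);
      })
    );
  });
}
```

This only protects content that follows the convention; anything a site stores in a cache carrying the module's prefix would still be deleted.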

This is also related to the conversation in #3060726-9: Add support for a second service worker, about different ways to influence the cached URLs. As long as we alter the list of URLs on the server side, we have a single authoritative list of URLs that should be in the cache. Once we allow adding cache URLs from the client side, there is no single list anymore, and deciding whether a cache entry is garbage becomes much harder.
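With a single authoritative list, the garbage check reduces to a set difference: anything in the cache that is not on the list is stale. A minimal sketch, assuming the list is available to the service worker as an array of absolute or relative URLs; `findStaleUrls` and `collectGarbage` are hypothetical names, not existing module functions.

```javascript
// Returns the cached URLs that are absent from the authoritative list.
// Relative entries are resolved against `origin` before comparing.
function findStaleUrls(cachedUrls, authoritativeUrls, origin) {
  const keep = new Set(
    authoritativeUrls.map(function (url) {
      return new URL(url, origin).href;
    })
  );
  return cachedUrls.filter(function (url) {
    return !keep.has(new URL(url, origin).href);
  });
}

// Possible activate task: delete every stale entry from one cache.
// `cache` is a Cache object as returned by caches.open().
function collectGarbage(cache, authoritativeUrls, origin) {
  return cache.keys().then(function (requests) {
    const urls = requests.map(function (request) {
      return request.url;
    });
    const stale = new Set(findStaleUrls(urls, authoritativeUrls, origin));
    return Promise.all(
      requests
        .filter(function (request) {
          return stale.has(request.url);
        })
        .map(function (request) {
          return cache.delete(request);
        })
    );
  });
}
```

This sketch treats every URL that is missing from the list as garbage, which is exactly the behavior that becomes unsafe once URLs can also be added from the client side.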

Proposed solution

Create a way to maintain a "single source of truth" that stores the list of cacheable URLs. This could be something like drupalSettings (variables exported from PHP to JS) combined with get/set/delete methods. Adding an item to the list should cause the URL to be added to the cache. URLs added on the client side should be remembered in a cookie or localStorage so that this information is not lost on a page refresh. Removing an item from the list should remove the item from the cache, either immediately (if the list is modified in JS) or during a garbage cleanup action.
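One way the proposed list could look on the page side is sketched below. Everything in it is an assumption for illustration: the class name `CacheUrlRegistry`, the storage key `pwaClientCacheUrls`, and the idea that the server-provided URLs arrive as a plain array (for example from drupalSettings). The storage object is injected and only needs `getItem` and `setItem`, which `window.localStorage` provides.

```javascript
// Hypothetical registry combining server-provided URLs with URLs added on
// the client. Client additions are persisted so they survive a page refresh.
class CacheUrlRegistry {
  constructor(serverUrls, storage, storageKey = 'pwaClientCacheUrls') {
    this.serverUrls = new Set(serverUrls);
    this.storage = storage;
    this.storageKey = storageKey;
    // Restore client-side additions saved by an earlier page load.
    this.clientUrls = new Set(JSON.parse(storage.getItem(storageKey) || '[]'));
  }

  // The full list of URLs that should be in the cache.
  list() {
    return Array.from(new Set([...this.serverUrls, ...this.clientUrls]));
  }

  has(url) {
    return this.serverUrls.has(url) || this.clientUrls.has(url);
  }

  // Registers a client-side URL and persists the change.
  add(url) {
    this.clientUrls.add(url);
    this.persist();
  }

  // Removes a client-side URL; server-provided URLs are left untouched.
  // Returns true if the URL was present among the client-side additions.
  remove(url) {
    const removed = this.clientUrls.delete(url);
    this.persist();
    return removed;
  }

  persist() {
    this.storage.setItem(this.storageKey, JSON.stringify([...this.clientUrls]));
  }
}
```

In a real integration, `add()` would presumably be paired with `cache.add(url)` and `remove()` with `cache.delete(url)`, and the service worker would need its own copy of the list, since localStorage is not available inside a service worker.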

Feature request

Status: Postponed
Version: 2.0
Component: Code
Created by: marcvangend (Amsterdam, 🇳🇱 Netherlands)
