Cache tags / hash collision

Created on 21 October 2024, about 1 month ago

Problem/Motivation

While investigating some hard-to-debug cache misses on Cloudflare for a site we support, I looked at how \Drupal\cloudflarepurger\EventSubscriber\CloudFlareCacheTagHeaderGenerator::cacheTagsToHashes() transforms Drupal cache tags into the values sent to Cloudflare. The transformation is substr(md5($cache_tag), 0, 3), which leaves only 16^3 = 4,096 possible values, so collisions are likely even with the modest set of cache tags found on our site.

I then found issue https://www.drupal.org/project/cloudflare/issues/3401335 (Excessive Tag Hash Collisions, currently RTBC), which changes that logic to substr(base_convert(md5($entry), 16, 36), 0, 4). With the same set of example cache tags, however, collisions still occur, although admittedly fewer of them.
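To put rough numbers on both schemes, here is a minimal Python sketch of the birthday-problem estimate. It assumes the truncated hashes are uniformly distributed over the output space, which is an idealization. It also treats 36^4 as an upper bound for the second scheme: the leading characters of a base-36 rendering are probably not uniform, and PHP's base_convert() is documented to lose precision on large numbers, so the real figures for that scheme may be worse. The function names and tag counts are hypothetical and not taken from the module.

```python
import math


def collision_probability(num_tags: int, space_size: int) -> float:
    """Chance that at least two of num_tags distinct tags share a hash,
    assuming hashes are uniformly distributed over space_size values."""
    if num_tags > space_size:
        return 1.0  # pigeonhole: more tags than possible hash values
    # P(no collision) = prod_{i=0}^{n-1} (1 - i/N), summed in log space.
    log_no_collision = sum(
        math.log1p(-i / space_size) for i in range(num_tags)
    )
    return 1.0 - math.exp(log_no_collision)


def expected_colliding_pairs(num_tags: int, space_size: int) -> float:
    """Expected number of tag pairs that share a hash: C(n, 2) / N."""
    return num_tags * (num_tags - 1) / (2 * space_size)


HEX_3 = 16 ** 3     # substr(md5($cache_tag), 0, 3)
BASE36_4 = 36 ** 4  # upper bound for 4 base-36 characters

if __name__ == "__main__":
    for n in (50, 100, 500, 1000):  # hypothetical tag counts
        print(
            f"{n:>5} tags: "
            f"P(3 hex)={collision_probability(n, HEX_3):.3f}  "
            f"P(4 base36)<={collision_probability(n, BASE36_4):.3f}  "
            f"pairs(3 hex)={expected_colliding_pairs(n, HEX_3):.1f}"
        )
```

The same functions can be used to size a truncation length: increase the length until the estimated probability for the site's tag count drops below an acceptable threshold.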

Steps to reproduce

  • Run the attached PHP script with the example list of cache tags
  • Watch the collisions the script reports
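For readers without the attachment, the check can be approximated with a short Python sketch. This is not the attached PHP script. It only mirrors substr(md5($cache_tag), 0, 3) using hashlib, and the tag list is a made-up placeholder rather than the list from our site. The names short_hash and find_collisions are hypothetical.

```python
import hashlib
from collections import defaultdict


def short_hash(tag: str, length: int = 3) -> str:
    """Python equivalent of PHP's substr(md5($tag), 0, $length)."""
    return hashlib.md5(tag.encode("utf-8")).hexdigest()[:length]


def find_collisions(tags, length: int = 3) -> dict:
    """Group distinct tags by truncated hash and return only the groups
    in which more than one tag maps to the same hash."""
    buckets = defaultdict(list)
    for tag in dict.fromkeys(tags):  # drop duplicate tags, keep order
        buckets[short_hash(tag, length)].append(tag)
    return {h: group for h, group in buckets.items() if len(group) > 1}


if __name__ == "__main__":
    # Placeholder tags for illustration only (not the site's real list).
    example_tags = [f"node:{i}" for i in range(1, 501)]
    example_tags += ["node_list", "config:system.site"]
    for hashed, group in sorted(find_collisions(example_tags).items()):
        print(f"{hashed}: {', '.join(group)}")
```

Raising the length argument shows how quickly the number of colliding groups shrinks as more characters of the hash are kept.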

Proposed resolution

This is up for debate. The Cloudflare documentation at https://developers.cloudflare.com/cache/how-to/purge-cache/purge-by-tags... states a few key constraints:
For including cache tags in the response headers

Individual tags do not have a maximum length, but the aggregate Cache-Tag HTTP header cannot exceed 16 KB after the header field name, which is approximately 1,000 unique tags. Length includes whitespace and commas but does not include the header field name.

and

A single HTTP response can have more than one Cache-Tag HTTP header field.

For later purging (API call)

For cache purges, the maximum length of a cache-tag in an API call is 1,024 characters.

Two solutions are possible here:

  1. Change the transformation to a full hash function, such as md5() alone, for which the chance of collision is virtually zero, while (a) splitting the Cache-Tag response header into multiple headers per response if the maximum length is exceeded and (b) ensuring the payload (request body) of the cache purge API calls still works
  2. Keep a single Cache-Tag response header per response, but increase the length to which the transformed cache tags are cut so that collisions become a rare occurrence
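As an illustration of option (1), here is a minimal Python sketch that hashes tags with full MD5 and packs them into as many Cache-Tag header values as needed. It assumes tags are joined with commas and that the 16 KB limit quoted above applies to each header value separately. The function names are hypothetical and this is not the module's code.

```python
import hashlib

# 16 KB aggregate limit, taken from the Cloudflare documentation quoted above.
MAX_HEADER_VALUE_BYTES = 16 * 1024


def hash_tag(tag: str) -> str:
    """Full MD5 hex digest (32 ASCII characters), no truncation."""
    return hashlib.md5(tag.encode("utf-8")).hexdigest()


def split_into_header_values(hashed_tags, max_bytes=MAX_HEADER_VALUE_BYTES):
    """Pack comma-separated tags into header values of at most max_bytes.

    Each returned string would be sent as its own Cache-Tag header.
    Tags are assumed to be ASCII, so character count equals byte count.
    """
    values = []
    current = []
    current_len = 0
    for tag in hashed_tags:
        if len(tag) > max_bytes:
            raise ValueError("a single tag exceeds the header limit")
        added = len(tag) + (1 if current else 0)  # +1 for the comma
        if current and current_len + added > max_bytes:
            values.append(",".join(current))
            current, current_len = [tag], len(tag)
        else:
            current.append(tag)
            current_len += added
    if current:
        values.append(",".join(current))
    return values


if __name__ == "__main__":
    tags = [hash_tag(f"node:{i}") for i in range(1000)]  # placeholder tags
    for value in split_into_header_values(tags):
        print(f"Cache-Tag: {value[:40]}... ({len(value)} bytes)")
```

A 32-character digest is also well under the 1,024-character limit for a tag in a purge API call, so part (b) of option (1) mainly concerns the overall size of the request body rather than individual tags.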

Personally I think (1) is a better approach.

Remaining tasks

  1. Chime in and agree on an approach
  2. Write patches, unit tests, etc.

User interface changes

None

API changes

This would change the cache tag values a site currently reports for its pages, so a full cache purge would be advisable, if not required, when deploying this to production.

Data model changes

None

🐛 Bug report
Status

Active

Version

2.0

Component

Code

Created by

🇧🇷Brazil erickbj



Comments & Activities

  • Issue created by @erickbj
  • 🇧🇷Brazil erickbj

    Example runs with the original logic and with the logic from #3401335 are attached, based on the testing script shared previously.

  • 🇧🇷Brazil erickbj

    In fact, it looks like the service parameter cloudflarepurger.cache_tag_header_limit was introduced early in the code base but is not currently used. It seems that issue https://www.drupal.org/project/cloudflare/issues/3197141 (Cache-Tag header limit is incorrect, Needs review) proposed repurposing that service parameter, but this was not implemented, in favor of https://www.drupal.org/project/drupal/issues/2844620 (FinishResponseSubscriber: Need warning/error when headers exceed 16k, Needs work) in core.

    However, #2844620 seems to add an auto-incrementing suffix (e.g., "-1", "-2") to the headers, and I don't think that would work with Cloudflare's Cache-Tag header. The suggestion in their documentation, as linked above, is to send multiple headers with different values, presumably all with the same header name, i.e., Cache-Tag.

    Please let me know if I missed anything.

  • Pipeline finished with Failed
    about 1 month ago
    Total: 109s
    #319811
  • Pipeline finished with Failed
    about 1 month ago
    Total: 164s
    #319847