Improve cacheTagsToHashes method and cache tag compression

Created on 9 December 2022, almost 2 years ago
Updated 31 January 2024, 10 months ago

I want to use the cacheTagsToHashes function to expose cache tags to another application (Next.js, btw).

I would like to make two improvements:
- Improve @return: we don't return an array of strings, we return a string.
- Move the separator into a constant: that way, if we use the method in another service, we are sure of the separator.

We also need to improve the hashing code so that each hash is unique.
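
The two changes listed above could look roughly like the sketch below. It is not the module's actual code: the class name `CacheTagHasher`, the constant name `HASH_SEPARATOR`, the space separator, the `$length` parameter and the truncated-MD5 hashing are all assumptions made for illustration.

```php
<?php

/**
 * Hypothetical stand-in for the module's cache manager (name is illustrative).
 */
final class CacheTagHasher {

  /**
   * Separator between hashes. Assumed to be a space; exposing it as a
   * constant lets other services split the string without guessing.
   */
  public const HASH_SEPARATOR = ' ';

  /**
   * Converts cache tags into one header-ready string of hashes.
   *
   * @param string[] $tags
   *   Cache tags, e.g. ['node:76', 'node_list'].
   * @param int $length
   *   Number of hex characters kept per hash (assumed default: 4).
   *
   * @return string
   *   The hashes joined with self::HASH_SEPARATOR.
   */
  public static function cacheTagsToHashes(array $tags, int $length = 4): string {
    $hashes = [];
    foreach ($tags as $tag) {
      // Truncated MD5 is an assumption here; short prefixes can collide.
      $hashes[] = substr(md5($tag), 0, $length);
    }
    // Drop duplicate hashes so the header stays as short as possible.
    return implode(self::HASH_SEPARATOR, array_unique($hashes));
  }

}

// Another service can split the value using the same constant.
$header = CacheTagHasher::cacheTagsToHashes(['node:76', 'node_list']);
$hashes = explode(CacheTagHasher::HASH_SEPARATOR, $header);
```

Because the separator lives in one place, a consumer such as a Next.js bridge service only needs to know the constant, not the internals of the method.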

✨ Feature request
Status

Fixed

Version

4.0

Component

Code

Created by

🇫🇷 France arnaud-brugnon

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇺🇸 United States EJSeguinte

    Would it be possible to make the hash length customizable, or at least increase it to 8? I have run into an issue where the first 4 digits of one hash match another, and it ends up clearing a lot more pages than intended.

  • 🇫🇷 France arnaud-brugnon

    What a coincidence, I just ran into the same issue.

    My colleague and I are working on a better solution, but it's not easy.
    We can't just increase the hash length and hope for the best.
    It will reduce duplicate X-Tag values, but not enough.

    Also, we can't just use a long string for the X-Tag.
    The X-Tag, as it is currently built, is really useful for keeping nginx from raising an error about a too-big header.

    If we can come up with better cache tag compression, we will post it.

  • 🇵🇱 Poland shumer Wroclaw

    Hey guys!

    We can't just increase the hash length and hope for the best.
    It will reduce duplicate X-Tag values, but not enough.

    That's correct. The X-Tag was initially added to avoid too-big-header exceptions.

    We could mirror the solution from Acquia Purge, something like:

      /**
       * Create a hash with the given input and length.
       *
       * @param string $input
       *   The input string to be hashed.
       * @param int $length
       *   The length of the hash.
       *
       * @return string
       *   Cryptographic hash with the given length.
       */
      protected static function hashInput($input, $length) {
        // MD5 is the fastest algorithm beyond CRC32 (which is 30% faster, but high
        // collision risk), so this is the best bet for now. If collisions are going
        // to be a major problem in the future, we might have to consider a hash DB.
        $hex = md5($input);
        // The produced HEX can be converted to BASE32 number to take less space.
        // For example 5 characters HEX can be stored in 4 characters BASE32.
        $hash = base_convert(substr($hex, 0, ceil($length * 1.25)), 16, 32);
        // Return a hash with consistent length, padding zeroes if needed.
        return strtolower(str_pad(substr($hash, 0, $length), $length, '0', STR_PAD_LEFT));
      }
    

    and make $length configurable. Then, in case of collisions, you can configure the hash length as you wish.
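
To get a feel for how much a configurable length helps, the standard birthday approximation can be sketched as below. The function name `collisionProbability` and the figure of 300 tags are hypothetical, and the estimate assumes hashes are uniformly distributed, which truncated MD5 only approximates.

```php
<?php

/**
 * Estimates the chance that at least two of $tagCount distinct cache tags
 * share a hash, using the birthday approximation 1 - exp(-n(n-1) / (2N)).
 *
 * @param int $tagCount
 *   Number of distinct cache tags in use (n).
 * @param int $length
 *   Number of characters kept per hash.
 * @param int $alphabetSize
 *   Symbols per character: 16 for hex, 32 for base32.
 */
function collisionProbability(int $tagCount, int $length, int $alphabetSize = 32): float {
  // N: how many different hashes exist for this length and alphabet.
  $space = pow((float) $alphabetSize, $length);
  $exponent = $tagCount * ($tagCount - 1) / (2 * $space);
  // -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
  return -expm1(-$exponent);
}

// Compare a 4-character hex hash with an 8-character base32 hash.
printf("4 hex chars:    %.6f\n", collisionProbability(300, 4, 16));
printf("8 base32 chars: %.12f\n", collisionProbability(300, 8, 32));
```

Under these assumptions, a few hundred tags already make a collision about as likely as not with 4 hex characters, while 8 base32 characters make it very unlikely. It still does not rule collisions out entirely, which matches the concern raised earlier in the thread.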

  • 🇫🇷 France arnaud-brugnon

    #8 is a great idea.

    But we have a better algorithm that's supposed to prevent any collision.
    If it works, the X-Tag may be a bit longer in some cases (for complex cache tags like 'segment1:segment2:segment3:segment4').

    But that's not the common case.

    For example, 'node:76' should encode to two characters (if I remember correctly):
    one letter and one digit.

  • 🇫🇷 France arnaud-brugnon

    Here are some examples of compression:

    Cache tag : "agency:yvelines:houilles"
    Custom compression : 1L2X2N
    #8 solution : ikaa11ev

    Cache tag : "node:76"
    Custom compression : 1c1e
    #8 solution : fgrjero9

    Cache tag : "node:67"
    Custom compression : 1c15
    #8 solution : gn4fonhc

    Cache tag : "abcd"
    Custom compression : K
    #8 solution : sbu72j27

    Cache tag : "agency:yvelines:andagainoadsglkfghalkefgjkasdfbglkasdbglkabsdlkgjasdfkjgnasdg"
    Custom compression : 1L2XgM
    #8 solution : se47ecf1

  • Status changed to Needs work about 1 year ago
  • 🇫🇷 France arnaud-brugnon

    Here's our full solution.

    This patch must be installed together with #5.

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.1.4 + Environment: PHP 5.3 & MySQL 5.5
    last update 12 months ago
    Patch Failed to Apply
  • 🇫🇷 France arnaud-brugnon

    Please let me know if you don't want our solution; we will move it into a custom CacheManager and override the service injection (to avoid keeping so many changes in a patch).

  • Status changed to Needs review 12 months ago
  • 🇫🇷 France arnaud-brugnon

    Small improvement to have only one patch and to pass the X-Tag in lowercase, because Varnish refuses to invalidate uppercase X-Tag values.

  • Open in Jenkins → Open on Drupal.org →
    Core: 10.1.x + Environment: PHP 8.2 & MySQL 8
    last update 12 months ago
    Composer require failure
  • Status changed to Fixed 11 months ago
  • 🇵🇱 Poland shumer Wroclaw
  • Automatically closed - issue fixed for 2 weeks with no activity.

  • 🇫🇷 France arnaud-brugnon

    I made a mistake in #15.
    I erased the addition of letters.

    With that, the X-Tag value is nothing more than the value of the last letter (a bit annoying).

    Sorry about that.
    Here's the fix.

    @shumer can you publish the fix asap pls?

  • 🇵🇱 Poland shumer Wroclaw

    Fun fact: we have had no reports since the last release... which was a while ago :)
    Let me publish a fixed version.
