Aggregate cache operations per bin to allow for more specific asserts

Created on 19 January 2025, 2 months ago

Problem/Motivation

Per 📌 Review cache bin and cache tags of access policy caching Active , performance tests currently throw all cache get/set/delete operations in one single count per operation.

However, different bins have very different performance implications, specifically fast chained bins are way faster. Issues like 📌 Review cache bin usage of ThemeExtensionList and Theme cache context Active that move caches from regular to fast bins are not "seen" by performance tests at all.

Additionally, just having the count is a bit of a blackbox still. It's not always clear what/why there is a change. Maybe we should just store all cids as well so to simplify debugging and allow to compare/assert more specifically if so desired?

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

📌 Task
Status

Active

Version

11.0 🔥

Component

phpunit

Created by

🇨🇭Switzerland berdir Switzerland

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @berdir
  • 🇨🇭Switzerland berdir Switzerland

    Pushed a proposal, right now PerformanceData is a pure value object, I added a tiny bit of logic to sum up the counts, I also kept the existing set method calls and aggregations for BC.

    I also updated a handful of tests to show this including an additional assert on the cache gets on default. The second non-chained-fast bin with several get operations (7) is data, but I think we don't want to explicitly assert that, no idea how stable those hashes are:

    0 => 'route:[language]=en:[query_parameters]=:/en',
    1 => 'css:umami:umami:enhDWZUCRilCXGGt6daBK7eVBZ-TLhK-AMv4cB0y_qG7c1',
    2 => 'js:umami:en:u_rNnKlC9ds79tQ_PVLa3cfvZ4SZfSwy9XHF-AfxWqA11',
    3 => 'route_provider.route_load:e12969cb977b5b9f14e8b365ac441ec725a833d6586e87d7bcfa7b2c5fc214e2e814196e86f1e6d007e3e8b5703635782521189223a9fa225bcebdbdaf38d833',
    4 => 'js:umami:en:NXhscRe0440PFpI5dSznEVgmauL25KojD7u4e9aZwOM11',
    5 => 'js:umami:en:u_rNnKlC9ds79tQ_PVLa3cfvZ4SZfSwy9XHF-AfxWqA11',
    6 => 'route_provider.route_load:61d3f1526762a7ca8ed3aa9d059d2b475e3fb6a5c3f584efd6daaf7d5b942ab3268da173e50ca03f4166dbbb9e821bebb435727ee84eec58c3d67dc1811b5a39',
    
  • Pipeline finished with Canceled
    2 months ago
    Total: 741s
    #400084
  • 🇨🇭Switzerland berdir Switzerland

    There is definitely something very funky going on with those performance test assertions on CI, similar to the language caching issue.

    I accidentally did this on top of 📌 Review cache bin and cache tags of access policy caching Active . The counts are definitely, 100%, wrong. But the test didn't fail?

    Also need to rerun the other tests, they are also wrong.

  • Pipeline finished with Success
    2 months ago
    Total: 3112s
    #400088
  • Pipeline finished with Success
    2 months ago
    Total: 1025s
    #400237
  • 🇨🇭Switzerland berdir Switzerland

    Updated the tests.

    I've pulled in the performance test changes from Introduce a list of "common cache tags" to reduce lookup query amount Active and updated the counts to reflect 11.x, then we have a baseline to be able to see improvements.

  • Pipeline finished with Success
    2 months ago
    Total: 2344s
    #400292
  • 🇪🇸Spain plopesc Valladolid

    Made a quick review and code looks good to me. This is a great enhancement that will help identify scenarios where the current metrics feel somewhat opaque.

    Left a comment about how these new metrics could be checked in assertMetrics() method, but that could be discussed in a follow up if necessary, I think.

  • Pipeline finished with Success
    2 months ago
    Total: 527s
    #401120
  • 🇬🇧United Kingdom catch

    Detail like cache tags I was mostly relying on the grafana/tempo integration (via https://github.com/tag1consulting/ddev-gander) for this kind of granular information, so that it wouldn't be too overwhelming for people to write tests. However, as with database queries, once the tests are written and you can see what's going on much easier, it is all worth it and really helps to understand the impact of a change.

    Where it will not be that useful to keep track of, we can just not assert on it if we don't want to, so why not?

    Haven't reviewed the MR properly yet, but general +1 for getting more granular..

  • 🇬🇧United Kingdom catch

    Reviewed this and I can't see anything to complain about.

    I originally did not add loads of detail to what we could assert on, because I was worried about it being too cumbersome to write tests (and also just trying to get something committed to start with we can build on), but the more we use the performance tests, the more useful it is to be able to get this level of detail in the tests themselves, for some tests if not for all of them. This is only adding capabilities, leaving all the simpler assertions in place, so tests that don't need/want the level of detail can still skip it.

    Should allow us to demonstrate improvements/regressions for a wider category of issues.

  • 🇨🇭Switzerland berdir Switzerland

    berdir changed the visibility of the branch 3500683-review-cache-bin to hidden.

  • 🇨🇭Switzerland berdir Switzerland

    I'm OK getting this in as is, that said:

    * there was a bit of discussion if and how we should sort the bin information, if it should be alphabetically based on first access (like now). Any opinions?
    * I do think the existing two cache tag check counts (checksum and isValid) are not really useful/relevant anymore with this and I'd consider deprecating them because they don't reflect if this is a static cache hit or not. For example the new MR in Introduce a list of "common cache tags" to reduce lookup query amount Active actually increases the Checksum count, but that doesn't mean much. Follow-up issue for that?
    * I noticed I added the full cache tag assertion to the performance test in the navigation module, that probably doesn't make a lot of sense. Should I move that to StandardPerformanceTest (at least one example anyway) or maybe the umami tests? Umami will have a lot more due to all the entities there are checked there.

  • Pipeline finished with Failed
    2 months ago
    Total: 699s
    #402663
  • 🇨🇭Switzerland berdir Switzerland

    This has a valid fail after the rebase, so will need to update that.

  • 🇬🇧United Kingdom catch

    there was a bit of discussion if and how we should sort the bin information, if it should be alphabetically or based on first access (like now). Any opinions?

    Based on first access seems closer to reality, and if that changed I think we'd want to know? However this is not a strong opinion. When we eventually have async support and issues like 📌 Add a cache prewarm API and use it to distribute cache rebuids after cache clears / during stampedes Needs work that could mess it up, but only on very cold caches.

    I do think the existing two cache tag check counts (checksum and isValid) are not really useful/relevant anymore with this and I'd consider deprecating them because they don't reflect if this is a static cache hit or not.

    Yeah fine with deprecating those, the main thing was to take the queries out of the actual database query list for compatibility with redis et al later on.

    I noticed I added the full cache tag assertion to the performance test in the navigation module, that probably doesn't make a lot of sense. Should I move that to StandardPerformanceTest (at least one example anyway) or maybe the umami tests?

    Standard is probably a better spot, although the navigation test will probably move to standard once it's stable anyway, so I don't think it matters that much.

    I am interested in what this looks like in Umami, but don't think that should block.

  • Pipeline finished with Failed
    2 months ago
    #403194
  • 🇨🇭Switzerland berdir Switzerland

    I noticed I added the full cache tag assertion to the performance test in the navigation module, that probably doesn't make a lot of sense. Should I move that to StandardPerformanceTest (at least one example anyway) or maybe the umami tests? Umami will have a lot more due to all the entities there are checked there.

    After re-reading my own diff, turns out that I did in fact not add it to the navigation test, I did add it to StandardPerformanceTest, so all good on that front.

    When running it locally, I noticed that OpenTelemetryAuthenticatedPerformanceTest was failing with a wrong query count. I really have no idea what's going on with those performance tests on CI, something seems very wrong. And it's not the test-only issue, because that is only about that specific job AFAIK. I also seem to get some random fails locally around checksum count and cache tag is valid count.

    I also had a look at adding specific cache tags to an umami test. But there are almost no tests there yet with specific cache assertions and I didn't want to start it. The only ones I found were hot tests for anonymous and authenticated user, they only have 1-2 cache tag queries but those are then obviously huge (around 100 cache tags). Doesn't seem very useful.

    I did instead add two more blocks to StandardPerformanceTest, once for the node detail page and one for the login, there we actually assert two pages, the login and redirect and the user page. Not 100% sure how useful that is, but I think it will show up quite nicely on the other performance improvement issues. Also one more example of default bin cids and cache get counts by bin.

  • Pipeline finished with Success
    2 months ago
    Total: 2668s
    #403201
  • Pipeline finished with Failed
    2 months ago
    Total: 419s
    #404109
  • Pipeline finished with Success
    2 months ago
    Total: 534s
    #404140
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Some variable and function names might need to be a bit more clear. In general I really like where this MR is going. Going to have another review after lunch as I'm not 100% sure we need some of the countByBin methods. But will get back to that.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Okay so we are hiding some gets with the array_merge in here:

            if (in_array($operation['operation'], ['get', 'getMultiple'], TRUE)) {
              $cache_operations['get'][$operation['bin']] = array_merge($cache_operations['get'][$operation['bin']] ?? [], explode(', ', $operation['cids']));
              $cache_get_count++;
            }
    

    This leads to discrepancies between the overall cache get count and the cache get count per bin. I'm not sure I'm a fan of that. Either we rename CacheGetCountByBin to something that indicates we're looking for unique CIDs or we do not deduplicate the data.

    This tacks on to the sentiment from my previous comment that the naming in this MR isn't always accurately representing what we're gathering or counting. I'd love to see the idea of this MR go in, albeit with clearer variable and method names.

  • 🇨🇭Switzerland berdir Switzerland

    array_merge() doesn't deduplicate, see https://3v4l.org/dL7Rh. According to your counts, the per-bin numbers are higher than the total, so it's the opposite.

    The reason for that is getMultiple(). The existing count treats getMultiple([A, B, C]) as one operation, for the new count it's 3 cids.

    The difference is the explode (and merge). I can just stop doing the explode and append them all to the list. The we can still see cids of getMultiple() calls, but the would show up in the list as ['A', 'B', 'Multiple,Cids,As,String']. Then we get consistent counts?

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Whoops, so used to combining that with array_unique that I read over it. My apologies for jumping to conclusions there.

    Hmm, I'm not against splitting the CIDs up when it comes to tracking them individually. But if it's at the cost of confusing counts, then I'm not sure which one is worse.

    For the DatabaseBackend, there is a difference between ['a', 'b', 'c'] and ['a,b,c'] in a sense that the former runs 3 queries and the latter only 1. So IMO the statistics should show a difference between the two because it does have a performance impact in some backends. In that line of thinking I think it would be better if we tracked the CIDs as they were passed along and not explode them.

    It will make the CID comparisons a tiny bit messier, but they'll still be quite clear to understand.

  • 🇨🇭Switzerland berdir Switzerland

    Pushed some updates now. This definitely makes a lot more sense. I agree that it's useful to see grouped multiple lookups explicitly, I just didn't think about that properly, would have done it like this from the start otherwise.

    > It will make the CID comparisons a tiny bit messier, but they'll still be quite clear to understand.

    Only if we add explicit asserts on them, I only added very specific ones (first theme extension list, that's gone, now module extension list, which is fixed by 📌 Drupal\Core\Theme\ComponentNegotiator::negotiate uses a lot of memory Active ). The only cache bin counts to be affected by this is config, none of the other bins are currently doing any getMultiple() on the calls we test. (technically do they but only with a single argument, like entity load).

    So, for now, nothing changed except the counts on the config bin.

    I'm not interested (at the moment, anyway) to track exactly how many config objects we load, they are in ChainedFast and reasonably fast.

  • Pipeline finished with Success
    2 months ago
    Total: 393s
    #409330
  • 🇨🇭Switzerland berdir Switzerland

    Everything addressed I think except one possible method name, setting back to needs review to get feedback on that.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    All remarks have been addressed and I like the general improvements we gain. Thanks for working on this!

  • 🇬🇧United Kingdom catch

    This looks great.

    When the performance tests went in, we were only counting database queries, which included database queries via the cache API. Because what is a hit against chained fast is non-deterministic, this caused random variations.

    Then when we split database and cache collection, cache collection started to 'see' all chained fast operations, which is useful to know, but meant no differentiation. At the time I was thinking about trying to split database vs. chained fast, but only had the idea of doing it based on cache backend, which would have been really complicated to implement because we don't have a 'fast cache backend' interface or anything like that.

    Doing it by bin just didn't enter my head, but this is useful not just for database vs. chained fast, but also for seeing things like where cache sets are going etc.

    Similarly, was worried about adding assertions on cache IDs because it's a lot of detail that could potentially put people off writing performance tests - but as with database query assertions, people can just not use it if they don't want to, same as other assertions in tests.

    Will be great to see this pick up changes in other issues that were either invisible or counter-intuitive before.

    Committed/pushed to 11.x, thanks!

    • catch committed 5cd35459 on 11.x
      Issue #3500739 by berdir, kristiaanvandeneynde: Aggregate cache...
  • 🇬🇧United Kingdom catch
  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024