- Issue created by @catch
- 🇬🇧United Kingdom catch
Assuming 📌 Allow assertions on the number of database queries run during tests RTBC and 📌 Add database query spans to otel traces Needs review land before this one, we should handle both assertions and open telemetry spans here.
Implementation something like:
1. Add a decorator for all cache backends (webprofiler module has one we can borrow from) which times cache operations to the
performance_test
module that's already in core.2. Replace the database query handling for cache operations with the information we collect directly from the cache API - we can throw away the database queries since they're duplicates.
3. Add the cache operations to open telemetry (again in place of the database queries for cache operations) so they can be browsed there - cid, bin and maybe backtrace would all be useful here.
- 🇬🇧United Kingdom catch
Some of this happened in 📌 Allow assertions on the number of database queries run during tests RTBC but we should decorate the cache services (and cache tags, and key value) instead of looking at database query backtraces so it works generically and can also be separated in Open Telemetry traces.
- Merge request !6204Initial implementation for recording cache operations via a decorator. → (Open) created by catch
- Status changed to Needs work
10 months ago 5:18pm 16 January 2024 - 🇬🇧United Kingdom catch
Opened a follow-up for cache tags 📌 Collect cache tags checksum queries via a decorator Postponed . I think we should separate them out via the same mechanism that we've been doing for cache operations before that issue (i.e. checking the query caller), because a decorator can't decorate the code that actually runs the query since it's buried in the checksum implementation, we'd need to subclass rather than decorate, which is not necessarily cleaner.
MR is green now, but I had to adjust a lot of cache assertions, not sure if that's because we've fixed bugs in the tests or introduced them yet.
- 🇬🇧United Kingdom catch
but I had to adjust a lot of cache assertions, not sure if that's because we've fixed bugs in the tests or introduced them yet.
OK figured it out. Previously we were watching only database queries, now we're watching cache operations, this includes chained fast, which includes apcu hits from chained fast, which don't hit the database - so we're seeing things we couldn't previously. This also might explain some of variation in cache get recording we've seen, if it depends whether things are in apcu or not.
I think it would be a good idea to add the cache backend class to the spans, so it's more obvious which are from the database backend and which are from chained fast, but that could potentially be a follow-up. We could also detect cache hits and misses now (because we have access to the return values in the decorator), but not sure if that's useful when we already have gets vs. sets vs. deletes.
- Status changed to Needs review
10 months ago 10:38pm 16 January 2024 - 🇬🇧United Kingdom catch
Might see if I can remove the ranges from some cache assertions, not sure if that's possible or not but it might be, otherwise could use review.
- 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
I love the idea of CacheBackendDecorator. In a similar fashion, we could even create a second decorator that tracks the cache hit ratio by comparing $cids_copy and $cids. Obviously not the goal of this decorator, but a nice thought nonetheless.
I'm a bit puzzled as to why some cache counts go down. I'd expect more logging to mean all of the counts go up? #8 seems to point to more hits across the board too because we now also find cache operations that don't hit the DB, yet in testAnonymous some cache hit assertions are adjusted downwards.
Also left 2 nitpicks, leaving at NR because they're so tiny
- 🇬🇧United Kingdom catch
I love the idea of CacheBackendDecorator
It is somewhat stolen from https://www.drupal.org/project/webprofiler → although the only bit that was really re-usable more or less as-is was the factory decorator.
we could even create a second decorator that tracks the cache hit ratio by comparing $cids_copy and $cids.
I think we could probably add that to this one? Not necessarily in this issue though.
I'm a bit puzzled as to why some cache counts go down.
Yeah me too - going to OpenTelemetry-ify the standard performance test and see if I can compare before/after. There is some inherent variability to these tests because they also request css and js aggregates and image derivatives which are Drupal bootstraps until the files exist, which makes them more realistic but less repeatable..
- 🇬🇧United Kingdom catch
I'm a bit puzzled as to why some cache counts go down.
I wondered if by recording chained fast gets that might get rid of the variability altogether, because it wouldn't matter whether things are in apcu or not, or what the last clear timestamp is compared to which millisecond they were written in... and that seems to be the case! So have switched all of standard performance test over to assertSame(). Will see if it's possible to tighten the query counts next.
- 🇬🇧United Kingdom catch
This has fixed 📌 Investigate variable database query counts in performance tests Active . Bumping to major because the shift from ranges to assertSame() makes everything much more robust both for false positives and false negatives.
- 🇬🇧United Kingdom catch
Adding a screenshot of how the cache spans look in grafana tempo after this change.
- 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
Going from random ranges to exact comparisons is huge so for that alone this looks great.
As far as I am concerned, this is RTBC, but I'll hold off in case @catch wants to add more sugar to this. So feel free to RTBC yourself if you feel this is complete. I am disregarding the Nightwatch test fail as I assume that one is random, but please do double-check by running said pipeline again.
I would like to see a follow-up where we mull whether we want to compare $cids_copy to $cids to determine the cache hit ratio, but compared to the benefit of the work being done here I would regard that feature as minor or normal at best.
- 🇬🇧United Kingdom catch
Opened 📌 Separate single and multiple cache operations in performance test assertions Active and 📌 Add cache hit and miss detection to performance tests Active following a slack discussion with @kristiaanvandeneynde and @KingDutch.
I adjusted the decorator so that every method calls the respective decorated method instead of re-using the *multiple methods, this should be more robust for different cache implementations and opens up counting single and multiple operations separately. Also opened 📌 Add minimal backtrace to cache spans in performance tests Active for adding backtrace info to spans, which should be simple but this MR is getting big enough and blocks several other issues now.
- Status changed to RTBC
10 months ago 11:54am 23 January 2024 - 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium
Pipelines go green, code looks good and I agree we can tackle the other remarks in the follow-ups. I would recommend to use strict type checking in the in_array() calls.
So RTBC, but please fix the three in_array() calls upon commit.
- Status changed to Fixed
10 months ago 4:30pm 23 January 2024 -
alexpott →
committed 9c618223 on 11.x
Issue #3396196 by catch, kristiaanvandeneynde, Kingdutch: Separate cache...
-
alexpott →
committed 9c618223 on 11.x
Automatically closed - issue fixed for 2 weeks with no activity.