Separate cache operations from database queries in OpenTelemetry and assertions

Created on 23 October 2023, about 1 year ago
Updated 6 February 2024, 11 months ago

Problem/Motivation

PerformanceTestBase currently relies on parsing database queries to get cache operations for assertions, and sends all database queries to OpenTelemetry. This was OK as an initial implementation but has a couple of drawbacks:

1. Because several core cache bins use chained fast, whether or not a cache get is recorded depends on apcu hits or misses. An apcu hit or miss depends on when the apcu entry was added vs the last write timestamp, and that is inherently variable because it depends on which unix second something falls into.

This turned out to be the reason for variable cache and database counts in performance tests, so this issue allows those to use assertSame() which will both prevent random test failures and also be a lot stricter when real changes occur.

2. dozens or hundreds of database queries in OpenTelemetry/Grafana Tempo is hard to scan, if we show cache operations distinctly, it's easier to look for database queries, cache gets, cache sets in the list and see what's going on.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

📌 Task
Status

Fixed

Version

11.0 🔥

Component
PHPUnit 

Last updated about 17 hours ago

Created by

🇬🇧United Kingdom catch

Live updates comments and jobs are added and updated live.
  • Performance

    It affects performance. It is often combined with the Needs profiling tag.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @catch
  • 🇫🇷France andypost

    Next kind of measure could be serialisation

  • 🇬🇧United Kingdom catch

    Assuming 📌 Allow assertions on the number of database queries run during tests RTBC and 📌 Add database query spans to otel traces Needs review land before this one, we should handle both assertions and open telemetry spans here.

    Implementation something like:

    1. Add a decorator for all cache backends (webprofiler module has one we can borrow from) which times cache operations to the performance_test module that's already in core.

    2. Replace the database query handling for cache operations with the information we collect directly from the cache API - we can throw away the database queries since they're duplicates.

    3. Add the cache operations to open telemetry (again in place of the database queries for cache operations) so they can be browsed there - cid, bin and maybe backtrace would all be useful here.

  • 🇬🇧United Kingdom catch

    Some of this happened in 📌 Allow assertions on the number of database queries run during tests RTBC but we should decorate the cache services (and cache tags, and key value) instead of looking at database query backtraces so it works generically and can also be separated in Open Telemetry traces.

  • Status changed to Needs work 11 months ago
  • 🇬🇧United Kingdom catch

    Not there yet but made a start.

  • 🇬🇧United Kingdom catch

    Opened a follow-up for cache tags 📌 Collect cache tags checksum queries via a decorator Postponed . I think we should separate them out via the same mechanism that we've been doing for cache operations before that issue (i.e. checking the query caller), because a decorator can't decorate the code that actually runs the query since it's buried in the checksum implementation, we'd need to subclass rather than decorate, which is not necessarily cleaner.

    MR is green now, but I had to adjust a lot of cache assertions, not sure if that's because we've fixed bugs in the tests or introduced them yet.

  • 🇬🇧United Kingdom catch

    but I had to adjust a lot of cache assertions, not sure if that's because we've fixed bugs in the tests or introduced them yet.

    OK figured it out. Previously we were watching only database queries, now we're watching cache operations, this includes chained fast, which includes apcu hits from chained fast, which don't hit the database - so we're seeing things we couldn't previously. This also might explain some of variation in cache get recording we've seen, if it depends whether things are in apcu or not.

    I think it would be a good idea to add the cache backend class to the spans, so it's more obvious which are from the database backend and which are from chained fast, but that could potentially be a follow-up. We could also detect cache hits and misses now (because we have access to the return values in the decorator), but not sure if that's useful when we already have gets vs. sets vs. deletes.

  • Status changed to Needs review 11 months ago
  • 🇬🇧United Kingdom catch

    Might see if I can remove the ranges from some cache assertions, not sure if that's possible or not but it might be, otherwise could use review.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    I love the idea of CacheBackendDecorator. In a similar fashion, we could even create a second decorator that tracks the cache hit ratio by comparing $cids_copy and $cids. Obviously not the goal of this decorator, but a nice thought nonetheless.

    I'm a bit puzzled as to why some cache counts go down. I'd expect more logging to mean all of the counts go up? #8 seems to point to more hits across the board too because we now also find cache operations that don't hit the DB, yet in testAnonymous some cache hit assertions are adjusted downwards.

    Also left 2 nitpicks, leaving at NR because they're so tiny

  • 🇬🇧United Kingdom catch

    I love the idea of CacheBackendDecorator

    It is somewhat stolen from https://www.drupal.org/project/webprofiler although the only bit that was really re-usable more or less as-is was the factory decorator.

    we could even create a second decorator that tracks the cache hit ratio by comparing $cids_copy and $cids.

    I think we could probably add that to this one? Not necessarily in this issue though.

    I'm a bit puzzled as to why some cache counts go down.

    Yeah me too - going to OpenTelemetry-ify the standard performance test and see if I can compare before/after. There is some inherent variability to these tests because they also request css and js aggregates and image derivatives which are Drupal bootstraps until the files exist, which makes them more realistic but less repeatable..

  • 🇬🇧United Kingdom catch

    I'm a bit puzzled as to why some cache counts go down.

    I wondered if by recording chained fast gets that might get rid of the variability altogether, because it wouldn't matter whether things are in apcu or not, or what the last clear timestamp is compared to which millisecond they were written in... and that seems to be the case! So have switched all of standard performance test over to assertSame(). Will see if it's possible to tighten the query counts next.

  • 🇬🇧United Kingdom catch

    This has fixed 📌 Investigate variable database query counts in performance tests Active . Bumping to major because the shift from ranges to assertSame() makes everything much more robust both for false positives and false negatives.

  • 🇬🇧United Kingdom catch
  • 🇬🇧United Kingdom catch
  • 🇬🇧United Kingdom catch
  • 🇬🇧United Kingdom catch

    Adding a screenshot of how the cache spans look in grafana tempo after this change.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Going from random ranges to exact comparisons is huge so for that alone this looks great.

    As far as I am concerned, this is RTBC, but I'll hold off in case @catch wants to add more sugar to this. So feel free to RTBC yourself if you feel this is complete. I am disregarding the Nightwatch test fail as I assume that one is random, but please do double-check by running said pipeline again.

    I would like to see a follow-up where we mull whether we want to compare $cids_copy to $cids to determine the cache hit ratio, but compared to the benefit of the work being done here I would regard that feature as minor or normal at best.

  • Pipeline finished with Failed
    11 months ago
    #81257
  • Pipeline finished with Success
    11 months ago
    #81260
  • 🇬🇧United Kingdom catch

    Opened 📌 Separate single and multiple cache operations in performance test assertions Active and 📌 Add cache hit and miss detection to performance tests Active following a slack discussion with @kristiaanvandeneynde and @KingDutch.

    I adjusted the decorator so that every method calls the respective decorated method instead of re-using the *multiple methods, this should be more robust for different cache implementations and opens up counting single and multiple operations separately. Also opened 📌 Add minimal backtrace to cache spans in performance tests Active for adding backtrace info to spans, which should be simple but this MR is getting big enough and blocks several other issues now.

  • Status changed to RTBC 11 months ago
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Pipelines go green, code looks good and I agree we can tackle the other remarks in the follow-ups. I would recommend to use strict type checking in the in_array() calls.

    So RTBC, but please fix the three in_array() calls upon commit.

  • Pipeline finished with Failed
    11 months ago
    #81309
  • Pipeline finished with Success
    11 months ago
    #81373
  • 🇬🇧United Kingdom alexpott 🇪🇺🌍

    Committed 9c61822 and pushed to 11.x. Thanks!

  • Status changed to Fixed 11 months ago
    • alexpott committed 9c618223 on 11.x
      Issue #3396196 by catch, kristiaanvandeneynde, Kingdutch: Separate cache...
  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024