Automated performance testing for core

Created on 20 November 2009, about 15 years ago
Updated 6 June 2024, 8 months ago

Problem/Motivation

It would be useful to have automated performance testing for Drupal core. Manual performance testing is sometimes required for issues, but it has a number of limitations.

Much like test coverage for bugs, without a test it's easier to introduce a regression than to fix one. This is because performance regressions don't always look like 'performance' issues.

Even where we do manual performance testing, it can be hard to determine what and how to test: benchmarks, xhprof, blackfire, devtools, EXPLAIN etc. People often struggle to produce useful performance data, i.e. ensuring that before/after comparisons are done on a site in exactly the same state, such as whether the cache is warm or not. It's also not easy to present performance data back to issues: links to blackfire go stale before issues get fixed, xhprof screenshots aren't accessible etc.

If we had some performance testing built into our CI framework, we'd be able to see the effect of at least some performance improvements and regressions automatically. This would also provide some examples for people to apply to manual testing, or to expand coverage when new improvements are added or regressions found.

Steps to reproduce

Some recently fixed issues that introduced what should be measurable improvements or regressions. We can use these to see whether performance testing shows a difference when they're reverted.
📌 Leverage the 'loading' html attribute to enable lazy-load by default for images in Drupal core Fixed
🐛 Stampedes and cold cache performance issues with css/js aggregation Fixed
🐛 Aggregation creates two extra aggregates when it encounters {media: screen} in a library declaration Fixed
🐛 Performance regression introduced by container serialization solution Fixed

Proposed resolution

There are broadly two types of performance tests we can do:

1. Absolute/objective/hard-coded/deterministic - write a phpunit test that ensures a certain thing (database queries, network requests) only happens a certain number of times, on a certain request.

An example of this that we already have in core is #2120457: Add test to guarantee that the Standard profile does not load any JavaScript for anonymous users on critical pages. These allow us to fail commits on regressions, but the number of things we can check like this is extremely limited, because each one needs to be consistent across hardware and configurations. Also, tests will need to be adjusted for functional changes as well as actual regressions (e.g. an extra block on the Umami front page, 'vegetable of the day', could mean an extra http request for an image, but this wouldn't be a 'regression' as such, just a new UX element in Umami).

2. Relative/subjective/dynamic/non-deterministic - these are metrics which are useful, but which vary with hardware, network, and what else the machine is doing (like running other phpunit tests). For these, we can collect certain metrics (time to first byte, largest contentful paint, entire xhprof runs), store those metrics permanently outside the test itself, i.e. with OpenTelemetry, then graph over time, compare runs, show traces from specific pages etc. This might allow us to do things like compare the runs between a known state like 10.0.0 and an MR, if we can find a way to show diffs.
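
The difference between the two types can be sketched in a few lines. This is a language-neutral illustration in Python, not Drupal's PHP test API: the function names, the metric label and all numbers are hypothetical. The first helper fails on any drift in a deterministic count; the second only reports how far a timing metric moved between a baseline and a candidate run, leaving the judgement to whoever reads the graph.

```python
from statistics import median


def assert_exact_count(metric: str, actual: int, expected: int) -> None:
    """Type 1: a deterministic count. Any difference fails the test."""
    if actual != expected:
        raise AssertionError(f"{metric}: expected {expected}, got {actual}")


def relative_change(baseline_ms: list[float], candidate_ms: list[float]) -> float:
    """Type 2: fractional change in the median timing, candidate vs. baseline.

    The median is used so that a single slow run (e.g. a busy CI machine)
    does not dominate the comparison.
    """
    base = median(baseline_ms)
    return (median(candidate_ms) - base) / base


# Hypothetical values, for illustration only.
assert_exact_count("script requests, anonymous front page", actual=0, expected=0)

change = relative_change([120.0, 118.0, 125.0], [131.0, 129.0, 140.0])
print(f"median time to first byte changed by {change:+.1%}")
```

The exact-count check is suitable for failing a commit. The relative figure is better stored and graphed over time, since the same code can produce different timings on different hardware.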

Remaining tasks

Add PerformanceTestBase for allowing browser performance assertions within FunctionalJavaScriptTests Fixed - adds PerformanceTestBase and allows counting of actual network requests via chromedriver.

Add OpenTelemetry Application Performance Monitoring to core performance tests Fixed - sends various non-deterministic data to OpenTelemetry for graphs/trends and possibly alerts.
📌 Add open-telemetry/sdk and open-telemetry/exporter-otlp as dev dependencies Active

Add more data collection for both phpunit assertions and OpenTelemetry
📌 Allow assertions on the number of database queries run during tests RTBC
📌 Add xhr and BigPipe assertions to PerformanceTestTrait Active
Needs issue: add support for database query logging - we can count number of queries by SELECT/UPDATE/INSERT/DELETE, query time etc. A possible 'absolute' test would be asserting the number of database queries executed on a warm page cache request.
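
As a rough sketch of what counting queries by type could look like, the snippet below tallies a list of logged SQL strings by their leading keyword. It is written in Python purely for illustration and assumes the query log is available as plain strings; the function name and the sample queries are hypothetical, not taken from core.

```python
from collections import Counter


def count_queries_by_type(queries: list[str]) -> Counter:
    """Tally logged SQL statements by their leading keyword (SELECT, INSERT, ...)."""
    counts: Counter = Counter()
    for sql in queries:
        stripped = sql.strip()
        if not stripped:
            continue  # ignore empty log entries
        keyword = stripped.split(None, 1)[0].upper()
        counts[keyword] += 1
    return counts


# Hypothetical query log, for illustration only.
logged = [
    "SELECT cid, data FROM cache_page WHERE cid = :cid",
    "select name FROM config WHERE collection = :collection",
    "INSERT INTO watchdog (type) VALUES (:type)",
]
counts = count_queries_by_type(logged)
print(dict(counts))
```

An 'absolute' test could then assert on `counts["SELECT"]` or on the total for a given request, in the same way as any other hard-coded count.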

Needs issue - consider adding a trait that handles instrumentation for unit/kernel/functional tests.

User interface changes

API changes

Data model changes

Release notes snippet

🌱 Plan
Status

Fixed

Version

11.0 🔥

Component
Base 

Last updated about 1 hour ago

Created by

🇬🇧United Kingdom catch

  • Performance

    It affects performance. It is often combined with the Needs profiling tag.
