Add a performance tests job to gitlab and send data to OpenTelemetry

Created on 4 October 2023, over 1 year ago
Updated 3 November 2023, about 1 year ago

Problem/Motivation

Now that "Add OpenTelemetry Application Performance Monitoring to core performance tests" (fixed) is in core, and there is a public OpenTelemetry/Grafana install at http://grafana.tag1demo.tag1.io/d/teMVIdjVz/umami?orgId=1&refresh=30s, we can run just the OpenTelemetry group tests with that endpoint configured so that traces are fed into it. This gives us a performance dashboard that's updated every hour.

Steps to reproduce

See data appear on http://grafana.tag1demo.tag1.io/d/teMVIdjVz/umami?orgId=1&refresh=30s

This is fed by a pipeline schedule on a Drupal core fork: https://www.drupal.org/project/core_performance_testbed

Proposed resolution

1. Add a DAILY_TEST variable to the daily schedule, and change the daily jobs to run only when that is set (done on gitlab courtesy of @longwave).

2. Add a new 'Performance tests' job that only runs when PERFORMANCE_TEST is set - this will be set on a different hourly pipeline schedule so we get regular and predictable data fed into grafana.

3. A couple of small bugfixes to PerformanceTestTrait: use getenv() instead of $_ENV, and make sure we look for firstContentfulPaint before we stop collecting log entries.
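As a rough illustration of steps 1 and 2, the job rules might look something like the sketch below. Only the DAILY_TEST and PERFORMANCE_TEST variable names and the 'Performance tests' job name come from this issue; the daily job name, the stage, and the script lines are hypothetical placeholders, not the contents of the actual MR.

```yaml
# Sketch only: 'Daily example job', the stage, and the script lines are
# hypothetical placeholders. DAILY_TEST and PERFORMANCE_TEST are the
# variables set on the two pipeline schedules.

'Daily example job':
  stage: test
  rules:
    # Runs only when the pipeline (e.g. the daily schedule) sets DAILY_TEST.
    - if: $DAILY_TEST
  script:
    - echo "placeholder for an existing daily job"

'Performance tests':
  stage: test
  rules:
    # Runs only when the pipeline (e.g. the hourly schedule) sets PERFORMANCE_TEST.
    - if: $PERFORMANCE_TEST
  script:
    # Placeholder: the real job would run only the OpenTelemetry test group.
    - echo "placeholder for running the OpenTelemetry group tests"
```

A bare `if: $VARIABLE` rule matches when the variable is defined and non-empty, so each schedule only needs to set its own variable to any value.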

If we notice wide variations in the AWS spot instances, we might want to move the test runs to slightly more dedicated hosting somehow, but it would be good to see everything working end-to-end asap and then adjust from there. Pretty sure this could be done by adding a dedicated/hard-coded runner to gitlab and then always sending performance tests to that, although that will require infra support.

The reason to run hourly rather than on-commit is twofold:

1. An hourly run means there will be data to look at even if there hasn't been a commit in 24 hours or so.

2. Conversely, if five commits are pushed within ten minutes, then either the commits would interrupt each other, leading to incomplete test runs followed by one final test run, or, if we let them all run, the points on the graph could overlap because different test runs complete different tests at different times.

Hourly runs smooth both of these things out. The related issue "Add the commit hash to OpenTelemetry traces" (active) will show us which commit hash produced which trace (and hopefully which point on the graph too).

Remaining tasks

To test the MR, we have to allow manual runs and hardcode the OTEL_COLLECTOR environment variable in the YAML. When this is RTBC, or immediately following an initial commit, both of these can be removed from the job YAML and configured in the gitlab UI instead.
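A minimal sketch of what that temporary testing configuration could look like, assuming the job is otherwise gated on PERFORMANCE_TEST. The collector URL, the stage, and the script line are placeholders invented for illustration; only the OTEL_COLLECTOR and PERFORMANCE_TEST names and the job name come from this issue.

```yaml
'Performance tests':
  stage: test
  variables:
    # Temporary, for MR testing only: hardcoded collector endpoint.
    # The URL is a made-up placeholder, not the real collector address.
    OTEL_COLLECTOR: "https://otel-collector.example.invalid/v1/traces"
  rules:
    # Intended long-term trigger: the hourly schedule sets PERFORMANCE_TEST.
    - if: $PERFORMANCE_TEST
    # Temporary, for MR testing only: otherwise allow starting the job by hand.
    - when: manual
      allow_failure: true
  script:
    # Placeholder for running the OpenTelemetry group tests.
    - echo "placeholder for running the OpenTelemetry group tests"
```

Once the temporary parts are dropped, the `variables:` entry and the `when: manual` rule would be removed, with OTEL_COLLECTOR defined as a CI/CD variable in the project settings or on the pipeline schedule instead.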

User interface changes

API changes

Data model changes

Release notes snippet

Type: Task
Status: Fixed
Version: 11.0
Component: PHPUnit
Last updated: about 4 hours ago
Created by: catch (United Kingdom)
