Improve CI pipeline runtime

MR switches us to use run-tests.sh and makes PHPUnit tests finish in <6 mins instead of 24.
makes cypress e2e run in parallel (4 containers) meaning it finishes in 10 mins instead of 20
Total pipeline time goes from 30+mins to 13

Next step would be to build a Cypress container and avoid paying 2+ mins of setup on every job where we install the dependencies each time.

Doesn't look like we have per repo container repos so that'd be an MR to our general CI containers repo

Pipeline finished with Success

3 months ago

Total: 5122s

#480860

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

🥳🥳🥳🥳

Less than 10 LoC changes to the CI setup! The GitLab template must have improved quite a bit since I last looked at it then 🚀 This was definitely not possible when I set the CI up originally. Wonderful 😊

I don’t think we need to add @group to test base classes though? NW for just that. Trivial to fix!

Comment 3 months ago →

🇬🇧United Kingdom catch

Could probably squeeze a few more seconds off by marking the following with @group #slow, they're the three slowest tests but they don't start first.

ApiLayoutControllerPatchTest
ApiLayoutControllerPostTest
ComponentTreeItemTest

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

@catch: ❤️

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

I don’t think we need to add @group to test base classes though? NW for just that. Trivial to fix!

Apparently core does this too, didn't realize! All good then, just trying @catch's #10 now :)

Comment 3 months ago →

🇪🇸Spain fjgarlin

The GitLab template must have improved quite a bit since I last looked at it

We've been busy 🙂

Another possible improvement could be to only run one of the DBs in the matrix during MRs and make the other jobs manually triggered. I guess most aren't dealing with different database nuances. Then we can leave the main project branch and/or scheduled runs to check all databases.

There is an error in the pipeline, but I don't think it's related to the changes.

Comment 3 months ago →

🇬🇧United Kingdom catch

For core other database types are manual on MRs, we run a selection on commit, and a bigger selection in a scheduled daily job on the development branches, weekly for the release branches (as of this morning due to the same screenshot that revived this issue). Very occasionally someone does something database related, everyone forgets to run the manual jobs, we commit a regression and have to revert, but the on-commit job catches those immediately unless it's also ignored for days, which I can only remember happening once in the past few years.

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

OMG!

Drupal\Tests\experience_builder\Kernel\EcosystemSupport\Fiel   1 passes    4s                                      
Drupal\Tests\experience_builder\Kernel\Plugin\ExperienceBuil  21 passes   94s                                      
Drupal\Tests\experience_builder\Kernel\Config\ContentTemplat   3 passes   13s

— https://git.drupalcode.org/project/experience_builder/-/jobs/5053302

Apparently run-tests.sh now lists how long the test took!!! 🥳

Took @catch's advice in #10. The 3 he highlighted take ~200 seconds. But tests are running with --concurrency 32. Played 2 minutes in TextMate and I now have a sorted list. I'll mark the slowest 21 as "slow": those are the ones that take >=100 seconds. That leaves room to spot new slow ones later and mark them slow too. Imagine the entire PHPUnit test suite running in ~100 seconds!

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

#13 + #14: I've been wanting to do something like that for months!

We introduced the multi-DB testing in 📌 PHPUnit SQLite CI job Needs review , for 📌 Prevent modules from being uninstalled if they provide field types used in an Experience Builder field Fixed , which uses the DB-specific JSON_EXTRACT stuff. But if everything works out as we hope in 📌 Calculate field and component dependencies on save and store them in an easy to retrieve format Active , we could just drop that altogether. 🤞

I didn't expect this issue to appear overnight though, so will spend a few minutes trying to get that working 🤓

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

@larowlan in #8:

MR switches us to use run-tests.sh and makes PHPUnit tests finish in <6 mins instead of 24.

to be precise:

Test run duration: 3 min 45 sec

(total CI job duration: 5 minutes 16 seconds)
— https://git.drupalcode.org/project/experience_builder/-/jobs/5053302

@catch's #10 + my #15
👇

Test run duration: 2 min 48 sec

(total CI job duration: 5 minutes 13 seconds)
— https://git.drupalcode.org/project/experience_builder/-/jobs/5055493 (at a time where there was still not full load on CI infra)

Shaved off another minute, the rest is just CI infra overhead AFAIK.

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Turns out we were using _PHPUNIT_CONCURRENT: 1 until 📌 Unit tests for PropExpressions to go to/from string representation + single StructuredDataPropExpression::from(…) method Fixed !

Specifically, I removed it because it was a work-around for 📌 Prepare for PHPUnit 10 Fixed .

Back then (10 months ago!) it didn't make a material difference: the test suite was much smaller.

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Interestingly, the MySQL-powered PHPUnit CI job was notably faster:

Not sure why that is. Rerunning both to see if it's reproducible. I'll make the default whichever of the 2 is fastest for @fjgarlin & @catch's proposal in #13 + #14.

Comment 3 months ago →

🇪🇸Spain fjgarlin

Yeah, on a bigger set of tests it makes a big difference to run tests with run-tests.sh. Amazing progress so far!!

Pipeline finished with Failed

3 months ago

Total: 1169s

#481051

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

makes cypress e2e run in parallel (4 containers) meaning it finishes in 10 mins instead of 20

This is great, but I just saw global-region.cy.js crash hard:

We detected that the Electron Renderer process just crashed.
We have failed the current spec but will continue running the next spec.
This can happen for a number of different reasons.
…

— https://git.drupalcode.org/project/experience_builder/-/jobs/5055503

I've not seen that happen on that particular test … ever? 🤔

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Yay, seems like the AI assistance to achieve #13 + #14 worked — I bet there's a nicer way to achieve this in GitLab CI, but this will do for here.

https://www.drupal.org/project/gitlab_templates → will hopefully some day lead the way and make this simple for every contrib module :)

Also, #21 didn't occur again, so I see only reasons to land this right now, to make XB development a bit smoother! 🚀

Pipeline finished with Skipped

3 months ago

#481114

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Comment 3 months ago →

System Message

wim leers → committed 715f26b0 on 0.x authored by larowlan →

Issue #3509614 by larowlan, wim leers, catch, fjgarlin: Improve CI...

Pipeline finished with Failed

3 months ago

Total: 637s

#481108

Comment 3 months ago →

🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

Next up: ✨ Would you accept a Cypress container Active . @fjgarlin, any thoughts on that? 😇

Comment 3 months ago →

🇪🇸Spain fjgarlin

I made a comment on that other issue. That'll defo shave more minutes.

Comment 3 months ago →

🇪🇸Spain fjgarlin

Re your last commit on this issue's MR, you could have maybe played with the variable CI_PIPELINE_SOURCE, so it does not trigger on MR but it does on push to the main branch. You could have also set the "when: manual" so the jobs can be triggered manually.

Comment 3 months ago →

🇬🇧United Kingdom catch

EDIT: The difference in #19 is not consistent. The second runs both took ~3 minutes. Presumably varies with CI infra load.

IME the gitlab CI runners alternate between at least two instance types, and one is about 1/3rd faster than the other, guessing CPU speed. This is mostly conjecture. It can also happen that sometimes a job gets a machine to itself, and sometimes it's sharing it with other heavy tests - e.g. 32 CPU is more than 32 CPU if there are more CPUs available.

I'm slightly surprised that this needed so many tests to be marked with @group #slow, but possibly 📌 Order tests by number of public methods to optimize gitlab job times Fixed and 📌 Include a check for data providers when ordering by method count Active only works with parallel > 1, would need to look again. Ideally that would 'just work' for contrib, 📌 Deprecate TestDiscovery test file scanning, use PHPUnit API instead Active will clean up some of the hacks but we might want to check what's going on when there's no CI parallelism to make sure the logic still runs.