- Issue created by @claudiu.cristea
- 🇪🇸Spain fjgarlin
Maybe you can get extra logs enabling
CI_DEBUG_SERVICES
? https://project.pages.drupalcode.org/gitlab_templates/info/common/#gathe... - 🇬🇧United Kingdom jonathan1055
The phpunit 'next minor' job currently has to redefine the services, because the regular phpunit uses 'with-chrome-legacy' whereas the next minor has 'with-chrome'
https://git.drupalcode.org/project/gitlab_templates/-/blob/main/includes...
So your custom override is only affecting the 'current' phpunit job.You can probably fix this in multiple ways, either by repeating the customisation in 'next minor', or re-using the services from your 'current' job.
- 🇪🇸Spain fjgarlin
True! It's 100% that. We are in a weird state where some versions of core 11 use an older yarn version and some others use a newer version of yarn, and they require different chromedrivers.
If this is backported we'd be able to remove that override and your code would work, but for now it's not possible, so yeah, for now I suggest following @jonathan1055 advice.
- 🇷🇴Romania claudiu.cristea Arad 🇷🇴
@jonathan1055, @fjgarlin, than you!
You can probably fix this in multiple ways, either by repeating the customisation in 'next minor', or re-using the services from your 'current' job.
I don't think I understand how repeat the same service definition under OPT_IN_TEST_NEXT_MINOR
- 🇬🇧United Kingdom jonathan1055
As it seems your tests do not need the chrome driver at all, you could try (untested)
phpunit (next minor): services: !reference [phpunit, services]
so that what ever you have in the 'current' phpunit services will be re-used in the 'next minor'. Or you could just duplicate your definition.
- 🇬🇧United Kingdom jonathan1055
Good to hear that. Just looked at your MR and I am pleased that the concise first option in #6 worked for you. It means that if you ever add another service to the phpunit job, the 'next minor' will automatically stay in sync with it.
- 🇺🇸United States cmlara
The phpunit 'next minor' job currently has to redefine the services, because the regular phpunit uses 'with-chrome-legacy' whereas the next minor has 'with-chrome
Did anyone ever test launching both CI images and just choosing which image is actually connected to by phpunit based on environment variables dynamically setting the container name? Presumably minimal cost ( higher base ram usage, and standby cpu usage for an idle container) for avoiding BC breaks until gitlab_templates creates a 2.x branch.
I did raise BC concerns in #3462681: Add W3C compliant JS testing → however I did not follow up as I thought it was a full swap not a partial swap of images between jobs.
- 🇬🇧United Kingdom jonathan1055
Did anyone ever test launching both CI images and just choosing which image is actually connected to
I don't recall ever seeing that suggestion. It is definitely worth investigating.
- 🇬🇧United Kingdom jonathan1055
Here's an initial test
https://git.drupalcode.org/project/scheduler/-/pipelines/339622PHPUnit jobs for 'current' and 'next minor' both passed. At the job start-up they both show:
[service:selenium/standalone-chrome-selenium] [service:drupalci/webdriver-chromedriver-chrome] [service:drupalci/webdriver-chromedriver-chrome] 2024-11-15T10:14:50.401430414Z ChromeDriver was started successfully.
The 'next minor' has more debug during runtime, and that shows
[service:selenium/standalone-chrome-selenium]
is being used during the tests. I'm not sure why there is no detailed services debug during the runtime tests on 'current', maybe the drivers do it differently. But the initial testing of this one javascript test shows that this is worth further investigation.Now I need to test on a project that has Nightwatch tests. Any suggestions?
- 🇪🇸Spain fjgarlin
Oh, this is a good suggestion. Starting services that we won't need is not ideal, but it's a consequence of the changes in core. The code looks much simpler and it also means we don't need to override that part for variants.
Downstream pipelines for decoupled_pages which has nightwatch tests: https://git.drupalcode.org/issue/gitlab_templates-3487525/-/pipelines/33... (you can always open an issue/MR there so you don't depend on me to run the downstream pipelines in this issue).
- 🇬🇧United Kingdom jonathan1055
Thanks for triggering the downstream pipeline. Yes I will open a draft [ignore] MR to add
CI_DEBUG_SERVICES: true
, but it looks promising as all three Nightwatch variants passed. - 🇬🇧United Kingdom jonathan1055
Tested on https://git.drupalcode.org/project/decoupled_pages/-/merge_requests/8
Pipeline https://git.drupalcode.org/issue/decoupled_pages-3422594/-/pipelines/340026
All three Nightwatch jobs pass.Current
Core Drupal: 11.0.7, Nightwatch 2.4.2 DRUPAL_TEST_WEBDRIVER_HOSTNAME='localhost' DRUPAL_TEST_WEBDRIVER_CHROME_ARGS='--disable-dev-shm-usage --disable-gpu --headless' DRUPAL_TEST_WEBDRIVER_W3C=false DRUPAL_TEST_WEBDRIVER_PORT='9515'
Previous Major
Core Drupal: 10.3.8, Nightwatch 2.4.2
Four variables same values as above.
Next Minor
Core Drupal: 11.1.0-dev, Nightwatch 3.7.0 DRUPAL_TEST_WEBDRIVER_HOSTNAME='selenium' DRUPAL_TEST_WEBDRIVER_CHROME_ARGS='--disable-dev-shm-usage --disable-gpu --headless' DRUPAL_TEST_WEBDRIVER_W3C=true DRUPAL_TEST_WEBDRIVER_PORT='4444'
I did not know (and did not even think we could) load both services, I thought there would be some clash or conflict. This is such a simple solution, and we already have the environment variables that make it work. It clashes with, but will simplify 📌 Update to new driver when D11 "current" uses Nightwatch 3 Postponed
The documentation pages for PHPUnit and Nightwatch will need updating if we are going to do this change.
- 🇪🇸Spain fjgarlin
Yeah, there is no clash because the names are different, so that made it easier. We didn't think of this because to be honest, it doesn't make too much sense to start several services knowing that you'll only pick up one, but on the other hand, having core 11.0.7 and 11.1.0-dev using a different one was something we didn't anticipate, so it's a good workaround/fix.
Also, that means that we are overriding one less part per variant, so the "inheritance/extension" will work better, not needing to re-define new sections per variant ("services" in this case) when they differ from the norm.
From my point of view, this is RTBC. I'm only curious about the performance impact, how much longer, if any, takes job creation as we are now adding one more service to the mix. It'd be great to have a before/after. I don't think that the impact will be big, but we'd need to take that into account somehow. I guess that enabling CI_DEBUG_SERVICES can give you useful info to know how long it takes to create the service.
- 🇬🇧United Kingdom jonathan1055
The tests already have
CI_DEBUG_SERVICES: 1
and there are lots of timestamps in the log. They are not all in chronological order, but I will extract the times and try to see what it shows us, if anything. - 🇬🇧United Kingdom jonathan1055
If I'm reading the logs correctly, the new selenium service takes two seconds to load. The earliest and latest timestamps, for example, are
[service:selenium/standalone-chrome-selenium] 2024-11-15T16:42:14 [service:selenium/standalone-chrome-selenium] 2024-11-15T16:42:16
The legacy driver is much quicker, mostly loading within one second
[service:drupalci/webdriver-chromedriver-chrome] 2024-11-15T16:42:14.27 [service:drupalci/webdriver-chromedriver-chrome] 2024-11-15T16:42:14.28
- 🇪🇸Spain fjgarlin
Thanks for checking.
Downstream pipelines: https://git.drupalcode.org/issue/gitlab_templates-3487525/-/pipelines/33...
The code looks good and there are tests on #12, #15 and the downstream pipelines. Setting to RTBC.
- 🇺🇸United States cmlara
Technically your not seeing all the factors in those jobs, such as does the image pull take longer due to bandwidth constraints, disk io, etc.
All that is monitoring is time between the two first container log messages which occur after the pod has started which is the most variable aspect.
Trying to performance check pod startup without raw access to the CI environment on dedicated runners where all environmental factors can be controlled is nearly pointless.
The jobs run with a minimum of 2cpu’s with (to my knowledge) burst authority to max CPU and no limitations on memory consumption. In other words extra containers have almost zero impact unless the node is near constraints from every other job pushing to limits.
To my knowledge the images are downloaded in cached mode not always-pull mode meaning other jobs running on the same runner will impact time to first log message and bandwidth usage. Image download is the biggest factor on a cold node, however even then I’ve generally seen remote servers have speed to spare and you need parallel downloads to max out the pipes if we ever get to the point we are worrying about image download times then infra has hit the point where they should have a local shared cache in the same VPC or at least same AWS region as the runners.
TLDR: don’t worry about the performance on a change like this.
Side note: realizing this won’t be a fix for everyone who has had to override containers if they use the documented method of expanding services by using the per container reference. In the future we may want to consider promoting using reference against testing-base services instead of the current documented suggestion of referencing each container by name. Still worth committing as it’s cleaner and I believe would set the base for doing such references changes (I’m working with using such a style over in quasar).
-
fjgarlin →
committed 352dda5b on main authored by
jonathan1055 →
Issue #3487525 by jonathan1055, fjgarlin, claudiu.cristea, cmlara:...
-
fjgarlin →
committed 352dda5b on main authored by
jonathan1055 →
- 🇬🇧United Kingdom jonathan1055
I just discovered that I had one extra commit locally for this MR. It was only to re-write the comments to explain what to do in MR257 on 📌 Update to new driver when D11 "current" uses Nightwatch 3 Postponed
Here's a patch so you can see what I had. Maybe this could be committed? I can open a new MR. Given that the changes may be postponed for a while, and I have already done the thinking about it and written the details, it would be good to have accurate
@todo
comments in the code ready. -
fjgarlin →
committed 1a968a0d on main
Issue #3487525: Additional comments.
-
fjgarlin →
committed 1a968a0d on main
- 🇬🇧United Kingdom jonathan1055
Thank you, that's really helpful. Sorry I missed it before.
Automatically closed - issue fixed for 2 weeks with no activity.