- Issue created by @godotislate
One thing I have noticed is that on the test-only job, this message shows twice:
HTML output directory sites/simpletest/browser_output is not a writable directory.
Example: https://git.drupalcode.org/issue/drupal-3493671/-/jobs/3819911
- 🇳🇿New Zealand quietone
I have seen the same with a test-only test.
On a non-test-only MR I get false positives on a FunctionalJavascript test. I modified the test in BlockAddTest to force a failure as follows:
public function testBlockAddThemeSelector(): void {
  $this->assertTrue(FALSE);
  // ... rest of the original test method left unchanged.
}
And GitLab shows a passing test, https://git.drupalcode.org/project/drupal/-/jobs/3836042#L486
Changing to major because of the false positives, which in turn prevent proving that a fix is working as expected. I think Task #3 is the one to resolve first here.
Not sure if any of these might help with the investigation:
https://www.drupal.org/node/3453468 →
🐛 browser_output is empty when testing with Drupal 11 Active
📌 Bootstrap HtmlOutputLogger from phpunit.xml RTBC
📌 Upgrade PHPUnit to 10, drop Symfony PHPUnit-bridge dependency Fixed
- 🇩🇪Germany rkoller Nürnberg, Germany
It looks like I've run into the same problem on https://www.drupal.org/project/drupal/issues/3485202 📌 Update to jQuery 1.14.1 and use the newly added option for dialog modal headings Active (link to the test-only changes job: https://git.drupalcode.org/issue/drupal-3485202/-/jobs/3890198). Per @catch in #testing, where I asked about my problem (https://drupal.slack.com/archives/C223PR743/p1735958223766589), it wouldn't hurt to have an extra example in here.
Aside from the browser_output message, I've also noticed that https://git.drupalcode.org/issue/drupal-3493671/-/jobs/3819911 also contains grep warnings:
grep: warning: * at start of expression
Noting here that recent test-only runs for the MR on 🐛 Placing a block containing a form in navigation layout prevents layout builder from saving Active have been failing as expected, such as https://git.drupalcode.org/issue/drupal-3493671/-/pipelines/391171. I'm not sure what's different, so it might be an intermittent issue.
- 🇩🇪Germany rkoller Nürnberg, Germany
Hm, I've rechecked my example after the comment in #10, and the test-only run is still passing there :/ https://git.drupalcode.org/issue/drupal-3485202/-/jobs/3973629 (triggered a new job).
- First commit to issue fork.
- 🇨🇭Switzerland berdir Switzerland
Re #11: Strictly speaking, the tests themselves are reported as skipped, not passing, but we allow a test job to complete in case of skipped tests. My guess was that the test-only job was maybe missing the chrome service, but that's there, and the tests in #10 are shown as one fail and the others passing.
Try adjusting .gitlab-ci/scripts/test-only.sh in your MR to add --fail-on-skipped to the phpunit command on line 27; maybe then phpunit will be kind enough to tell us why they are skipped? And/or a --debug.
- 🇩🇪Germany rkoller Nürnberg, Germany
Re #13: Thanks, I've tried to follow your suggestion. Not sure if the position I placed both flags in was the correct one (I am not a developer), but it looks like your suspicion might be right, at least in regard to 📌 Update to jQuery 1.14.1 and use the newly added option for dialog modal headings Active. The test output seems to complain about the inability to connect to the webdriver instance and that the selenium host could not be resolved?
The test wasn't able to connect to your webdriver instance. For more information read core/tests/README.md.
The original message while starting Mink: Could not open connection: Could not resolve host: selenium
https://git.drupalcode.org/issue/drupal-3485202/-/jobs/4069988
- 🇬🇧United Kingdom catch
There are several manually skipped tests in core; if we want to fail on skip, we'll have to handle those differently.
However, the test-only job should definitely fail on skip?
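For context, the manual skips look roughly like the following, as a hypothetical illustration rather than a specific core test: the skip is an explicit decision in the test code when a prerequisite is missing, which is different from the skips in this issue, where the environment fails to provide a browser session.
public function testZipArchiveHandling(): void {
  // Deliberate, documented skip: the prerequisite is genuinely unavailable
  // in this environment, so skipping is the intended behaviour.
  if (!extension_loaded('zip')) {
    $this->markTestSkipped('The zip PHP extension is not available.');
  }
  // ... actual assertions would follow here.
}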
- 🇬🇧United Kingdom catch
📌 Move helpers in ajax_forms_test.module and delete it Needs work was properly failing in HEAD, but only breaking some pipelines.
- 🇬🇧United Kingdom catch
Bumping this to critical because it's impossible to trust a test run at the moment.
- 🇨🇭Switzerland berdir Switzerland
Trying to look into this.
https://git.drupalcode.org/project/drupal/-/jobs/4115231 is a job that failed on HEAD with CommandsTest.
Drupal\FunctionalJavascriptTests\Ajax\CommandsTest 0 passes 19s 1 fails
Artefacts have browser output files for this test: https://git.drupalcode.org/project/drupal/-/jobs/4115231/artifacts/brows...
In total there are 890 .html files in there.
https://git.drupalcode.org/issue/drupal-3495959/-/jobs/3887812 is a job that passed CommandsTest on the merge request:
Drupal\FunctionalJavascriptTests\Ajax\CommandsTest 1 passes 14s
No browser output files in artefacts: https://git.drupalcode.org/issue/drupal-3495959/-/jobs/3887812/artifacts...
Only around 400 .html files. It's 1/3 vs 3/3 and the distribution I guess isn't stable? But 3/3 test on this pipeline still has 600+ .html files.
The execution time is what I find very suspicious: only 14s vs 19s on the failing test.
- 🇨🇭Switzerland berdir Switzerland
Downloading the artifact files, the passing MR job has this in phpunit-45.xml:
<testsuites>
  <testsuite name="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" tests="1" assertions="0" errors="0" failures="0" skipped="1" time="0.000000">
    <testcase name="testAjaxCommands" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" line="29" class="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" classname="Drupal.FunctionalJavascriptTests.Ajax.CommandsTest" assertions="0" time="0.000000">
      <skipped/>
    </testcase>
  </testsuite>
</testsuites>
Note the skipped="1" and time="0.00000".
The failing job has this in phpunit-47.xml:
<testsuites>
  <testsuite name="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" tests="1" assertions="21" errors="0" failures="1" skipped="0" time="17.595975">
    <testcase name="testAjaxCommands" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" line="29" class="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" classname="Drupal.FunctionalJavascriptTests.Ajax.CommandsTest" assertions="21" time="17.595975">
      <failure type="PHPUnit\Framework\ExpectationFailedException">Drupal\FunctionalJavascriptTests\Ajax\CommandsTest::testAjaxCommands Failed asserting that '<html lang="en" dir="ltr" class=" js"><head>\n <meta charset="utf-8">\n ...
Why does run-tests.sh think it passed and where are those 14s from?
- 🇮🇹Italy mondrake 🇮🇹
🐛 Allow run-tests.sh to report skipped/risky/incomplete PHPUnit-based tests Needs work would probably help.
- 🇨🇭Switzerland berdir Switzerland
Created a merge request where I'm changing WebDriverTestBase to fail rather than skip Mink tests. I think that makes sense: if you want to run a JS test and something broke while setting it up, then it's not skipped, it failed?
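For reference, WebDriverTestBase::initMink() currently catches the DriverException thrown when a Mink/Selenium session can't be started and marks the test as skipped. A minimal sketch of the kind of change described above, simplified and not the exact MR code (assumes use Behat\Mink\Exception\DriverException;):
protected function initMink() {
  try {
    return parent::initMink();
  }
  catch (DriverException $e) {
    // Previously this called $this->markTestSkipped(), which let the job
    // pass when no browser session could be started. Failing instead makes
    // the broken environment visible in the test results.
    $this->fail("The test wasn't able to connect to your webdriver instance. The original message while starting Mink: " . $e->getMessage());
  }
}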
Making that change, which took me a few attempts, now results in a ton of these errors:
https://git.drupalcode.org/issue/drupal-3494332/-/pipelines/405480/test_...
Drupal\Tests\ckeditor5\FunctionalJavascript\ImageTestProviderTest::testAltTextRequired with data set "Restricted" Behat\Mink\Exception\DriverException: Could not open connection: Could not start a new session. New session request timed out Host info: host: 'runner-s8ex1x2yj-project-171984-concurrent-0-4p8yesea', ip: '10.0.86.194' Build info: version: '4.23.1', revision: '656257d8e9' System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.10.230-223.885.amzn2.x86_64', java.version: '17.0.12' Driver info: driver.version: unknown
Too much concurrency maybe on those JS tests? Will play a bit with adjusting that.
- 🇨🇭Switzerland berdir Switzerland
Lowering concurrency does reduce those fails, but not nearly enough.
concurrency 5: 380 fails
concurrency 3: 180 fails
concurrency 1: still 100 fails.
I suspect there are a few of those also in the HEAD/ main project/drupal pipelines, but definitely far less. Does the main project get more resources or something? are there limits on fork projects?
Looked a bit into increasing the timeout but not sure I understand this.
- 🇬🇧United Kingdom catch
I suspect there are a few of those also in the HEAD/ main project/drupal pipelines, but definitely far less. Does the main project get more resources or something? are there limits on fork projects?
Not that I'm aware of. The DA did discuss a separate cluster for core, but I would assume even that would include MR forks, since those are where we really want the speed more than the on-commit/branch checks, and also that we'd know about it if it had happened.
Some of this looks similar to 🐛 Fix selenium performance/stampede issues in gitlab config and BrowserTestBase Fixed; there are various notes on there, although it was a lot of trial and error and guesswork.
The failures on the latest pipeline don't look like timeouts but actual test failures now?
- 🇨🇭Switzerland berdir Switzerland
I added two deliberate fails; all others are the timeout error.
- 🇬🇧United Kingdom catch
I upped the maximum sessions and increased the timeout to 30. Now I'm seeing different driver failures in addition to the timeouts, although it's possible they were already there and it's just the first one I spotted:
Drupal\Tests\ckeditor5\FunctionalJavascript\ImageTestProviderTest::testResize with data set "Image resize is disabled" Behat\Mink\Exception\DriverException: Could not close connection
I had a theory on 🐛 Fix selenium performance/stampede issues in gitlab config and BrowserTestBase Fixed that opening a browser session for each test method and closing it again might be a problem - e.g. that instead we should try to open a session per test-class and close it at the end of the test class. However that would be a lot of refactoring for something that may or may not be the cause of the problem.
- 🇬🇧United Kingdom catch
I think this might have been introduced with 📌 Use lullabot/mink-selenium2-driver and lullabot/php-webdriver for functional browser testing Fixed and we never fully spotted it because of the skipping. I've pushed some commits to try to max-out the available connections, so far this is not resulting in less timeouts.
- 🇬🇧United Kingdom catch
Even at 64 parallel jobs it's not possible for every test method to successfully get a browser instance.
If we look at https://git.drupalcode.org/project/drupal/-/jobs/4136498, the 4th test method timed out, and the whole job took 6 minutes which suggests it really did time out.
I think we need to try:
I had a theory on #3463286: Fix selenium performance/stampede issues in gitlab config and BrowserTestBase that opening a browser session for each test method and closing it again might be a problem - e.g. that instead we should try to open a session per test-class and close it at the end of the test class.
e.g. each test class gets a browser session; reset, but don't close, the browser session in between methods. It'll mean using a static property.
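A rough sketch of that idea, keeping the session in a static property so it survives across the methods of one test class; the names are illustrative and this is not the actual MR code (assumes use Behat\Mink\Session;):
// Keep one browser session per test class instead of one per method.
protected static ?Session $classSession = NULL;

protected function initMink() {
  if (static::$classSession === NULL) {
    // First method in the class: start a real browser session.
    static::$classSession = parent::initMink();
  }
  else {
    // Later methods: reuse the session, resetting cookies/state in between.
    static::$classSession->reset();
  }
  return static::$classSession;
}

public static function tearDownAfterClass(): void {
  // Only close the browser once every method in the class has run.
  if (static::$classSession !== NULL) {
    static::$classSession->stop();
    static::$classSession = NULL;
  }
  parent::tearDownAfterClass();
}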
- First commit to issue fork.
- 🇬🇧United Kingdom longwave UK
CI_DEBUG_SERVICES should give us additional logging from the Selenium container which might tell us what's going on.
- 🇬🇧United Kingdom oily Greater London
Re #17: Not sure it's critical yet. So long as people are running the test-only test and it is failing then it must be working as there is no skipping going on. So far it seems a very small number of tests are afflicted. The worst malady seems to be that people may hunt endlessly for errors in their tests. That hurts.
- 🇨🇭Switzerland berdir Switzerland
Re #33: This is not limited to test-only jobs; that's just one minor aspect. It affects *all* test runs. We had multiple HEAD test fails because merge requests were committed whose tests were supposed to fail. It very much is critical, and it was set to critical by a core maintainer. There really isn't anything to discuss about that.
- 🇬🇧United Kingdom catch
https://git.drupalcode.org/project/drupal/-/merge_requests/11005 - this forces one-mink-session-per-test-class instead of one per method. It's not passing yet, but it's failing much more predictably: when we run out of sessions, all the tests after that point time out and fail.
- 🇬🇧United Kingdom oily Greater London
When tests skip, what is the range of possible reasons for that in GitLab pipelines? Does it mean that the test itself is terminating mid-test? Or is the whole test being skipped entirely, or could it be both?
- 🇬🇧United Kingdom catch
So long as people are running the test-only test and it is failing then it must be working as there is no skipping going on.
The skipping happens on regular functional JavaScript jobs when we run out of selenium browser sessions to use. Exactly why the browser sessions aren't cleaned up / recycled (at all, or quickly enough) is still a mystery, but I'm 100% sure that we're running out of them before we run out of tests to run now. Nearly have a green MR with the latest approach, apart from the intentional failures that @berdir introduced and a couple of others where the MR logic needs tidying up a bit.
- 🇬🇧United Kingdom oily Greater London
Re #37: I was working with @rkoller weeks ago (I teed it up for him; he did most of the work), and it was baffling us, since his JS seems good. I think I did experience one previously affected issue that @smustgrave was aware of and waved through, since the test was either relatively unimportant or clearly good. Not sure if things have grown progressively worse since then; it sounds possible from #37, but I'm not sure. Or maybe not many people uncovered the issue over the past few weeks because not many triggered the test-only job. Anyway, it seems progress is happening.
- 🇬🇧United Kingdom longwave UK
I'm finding similar reports in Selenium's issue queue and on StackOverflow but no concrete answers.
https://github.com/SeleniumHQ/docker-selenium/issues/2373
https://github.com/SeleniumHQ/selenium/issues/14457
https://stackoverflow.com/questions/65050928/org-openqa-selenium-webdriv...
https://stackoverflow.com/questions/79076438/what-is-causing-intermitten...
- 🇬🇧United Kingdom catch
Trying XVFB based on https://github.com/SeleniumHQ/docker-selenium/issues/2373
- 🇬🇧United Kingdom catch
Made a big mess of the MRs here trying out various strategies. The xvfb setting is the root cause, which is a great find from @longwave, buried in an unresolved GitHub comment.
https://git.drupalcode.org/issue/drupal-3494332/-/tree/3494332-try-to-pass is my attempt to get to a passing pipeline with only the changes necessary to fail properly when a connection can't be made to selenium + changes necessary to pass. It is not quite there yet but it is close.
- 🇬🇧United Kingdom catch
https://git.drupalcode.org/project/drupal/-/pipelines/406418 is green - I had to re-run one functional js job due to a random failure. Fixing this is likely to result in several random js failures showing up more frequently because the tests will actually run every time.
Tremendous weekend work from @catch, and nice find from @longwave.
Will the changes in MR11007 address the similar issues in the test-only job? The configuration in the test-only job uses *with-chrome instead of *with-selenium-chrome, so the Selenium env variable changes probably don't apply? Or should addressing the test-only job be split into another issue?
- 🇬🇧United Kingdom catch
with-chrome is the legacy non-W3C driver setup. We have an issue to switch the performance test job to that; should check if there's an existing one for the test-only job to switch it too, or maybe combine those into one.
tbh I don't think those environment variables affect those jobs but am not 100% sure that they don't either.
- 🇬🇧United Kingdom catch
Added a comment to the mink reset.
Yes, same problem in gitlab templates; I opened 🐛 Set SE_START_XVFB: 'true' for selenium w3c testing Active.
- 🇬🇧United Kingdom catch
Since the only eventual change here was in the YAML config and a comment, going to self-RTBC here.
- longwave → committed 03ef97f4 on 11.x
Issue #3494332 by catch, berdir, longwave, godotislate, oily, rkoller,...
- 🇬🇧United Kingdom longwave UK
Not backported to 11.1.x because of the minor behaviour change; it should be enough to catch things in 11.x and backport any necessary bug fixes. Also not backported to 10.5.x because we are using the non-W3C container there, which doesn't appear to exhibit this problem.
Committed and pushed to 11.x, thanks!
Automatically closed - issue fixed for 2 weeks with no activity.