- Issue created by @godotislate
One thing I have noticed is that on the test-only job, this message shows twice:
HTML output directory sites/simpletest/browser_output is not a writable directory.
Example: https://git.drupalcode.org/issue/drupal-3493671/-/jobs/3819911
- 🇳🇿New Zealand quietone
I have seen the same with a test-only test.
On a non-test-only MR I get false positives on a FunctionalJavascript test. I modified the test in BlockAddTest to force a failure as follows:
public function testBlockAddThemeSelector(): void {
  $this->assertTrue(FALSE);
  // ... rest of the original test method left unchanged.
}
And GitLab shows a passing test, https://git.drupalcode.org/project/drupal/-/jobs/3836042#L486
Changing to major because of the false positives, which in turn prevent proving that a fix is working as expected. I think Task #3 is the one to resolve first here.
Not sure if any of these might help with the investigation:
https://www.drupal.org/node/3453468 →
🐛 browser_output is empty when testing with Drupal 11 Active
📌 Bootstrap HtmlOutputLogger from phpunit.xml RTBC
📌 Upgrade PHPUnit to 10, drop Symfony PHPUnit-bridge dependency Fixed
- 🇩🇪Germany rkoller Nürnberg, Germany
It looks like I've run into the same problem on https://www.drupal.org/project/drupal/issues/3485202 📌 Update to jQuery 1.14.1 and use the newly added option for dialog modal headings Active (link to the test-only changes job: https://git.drupalcode.org/issue/drupal-3485202/-/jobs/3890198). Per @catch in #testing, where I asked about my problem (https://drupal.slack.com/archives/C223PR743/p1735958223766589), it wouldn't hurt to have an extra example in here.
Aside from the browser_output message, I've also noticed that https://git.drupalcode.org/issue/drupal-3493671/-/jobs/3819911 also contains grep warnings:
grep: warning: * at start of expression
Noting here that recent test-only runs for the MR on 🐛 Placing a block containing a form in navigation layout prevents layout builder from saving Active have been failing as expected, such as https://git.drupalcode.org/issue/drupal-3493671/-/pipelines/391171. I'm not sure what's different, so it might be an intermittent issue.
- 🇩🇪Germany rkoller Nürnberg, Germany
Hm, I've rechecked my example after the comment in #10, and the test-only run is still passing there :/ https://git.drupalcode.org/issue/drupal-3485202/-/jobs/3973629 (triggered a new job).
- First commit to issue fork.
- 🇨🇭Switzerland berdir Switzerland
Re #11: Strictly speaking, the tests themselves are reported as skipped, not passing, but we allow a test job to complete in case of skipped tests. My guess was that the test-only job was maybe missing the chrome service, but that's there, and the tests in #10 are shown as one fail and the others passing.
Try adjusting .gitlab-ci/scripts/test-only.sh in your MR to add --fail-on-skipped to the phpunit command on line 27; maybe then phpunit will be kind enough to tell us why they are skipped? And/or a --debug.
- 🇩🇪Germany rkoller Nürnberg, Germany
Re #13: Thanks, I've tried to follow your suggestion. Not sure if the position I placed both flags in was the correct one (I am not a developer), but it looks like your suspicion might be right, at least in regard to 📌 Update to jQuery 1.14.1 and use the newly added option for dialog modal headings Active. The test output seems to complain about the inability to connect to the webdriver instance and that the selenium host could not be resolved?
The test wasn't able to connect to your webdriver instance. For more information read core/tests/README.md.
The original message while starting Mink: Could not open connection: Could not resolve host: selenium
https://git.drupalcode.org/issue/drupal-3485202/-/jobs/4069988
- 🇬🇧United Kingdom catch
There are several manually skipped tests in core; if we want to fail on skip, we'll have to handle those differently.
However, the test-only job should definitely fail on skip?
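For context, the manual skips look roughly like the following, as a hypothetical illustration rather than a specific core test: the skip is an explicit decision in the test code when a prerequisite is missing, which is different from the skips in this issue, where the environment fails to provide a browser session.
public function testZipArchiveHandling(): void {
  // Deliberate, documented skip: the prerequisite is genuinely unavailable
  // in this environment, so skipping is the intended behaviour.
  if (!extension_loaded('zip')) {
    $this->markTestSkipped('The zip PHP extension is not available.');
  }
  // ... actual assertions would follow here.
}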
- 🇬🇧United Kingdom catch
📌 Move helpers in ajax_forms_test.module and delete it Needs work was properly failing in HEAD, but only breaking some pipelines.
- 🇬🇧United Kingdom catch
Bumping this to critical because it's impossible to trust a test run at the moment.
- 🇨🇭Switzerland berdir Switzerland
Trying to look into this.
https://git.drupalcode.org/project/drupal/-/jobs/4115231 is a job that failed on HEAD with CommandsTest.
Drupal\FunctionalJavascriptTests\Ajax\CommandsTest 0 passes 19s 1 fails
Artefacts have browser output files for this test: https://git.drupalcode.org/project/drupal/-/jobs/4115231/artifacts/brows...
In total there are 890 .html files in there.
https://git.drupalcode.org/issue/drupal-3495959/-/jobs/3887812 is a job that passed CommandsTest on the merge request:
Drupal\FunctionalJavascriptTests\Ajax\CommandsTest 1 passes 14s
No browser output files in artefacts: https://git.drupalcode.org/issue/drupal-3495959/-/jobs/3887812/artifacts...
Only around 400 .html files. It's 1/3 vs 3/3 and the distribution I guess isn't stable? But 3/3 test on this pipeline still has 600+ .html files.
The execution time is what I find very suspicious: only 14s vs 19s on the failing test.
- 🇨🇭Switzerland berdir Switzerland
Downloading the artifact files, the passing MR job has this in phpunit-45.xml:
<testsuites>
  <testsuite name="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" tests="1" assertions="0" errors="0" failures="0" skipped="1" time="0.000000">
    <testcase name="testAjaxCommands" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" line="29" class="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" classname="Drupal.FunctionalJavascriptTests.Ajax.CommandsTest" assertions="0" time="0.000000">
      <skipped/>
    </testcase>
  </testsuite>
</testsuites>
Note the skipped="1" and time="0.00000".
The failing job has this in phpunit-47.xml:
<testsuites>
  <testsuite name="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" tests="1" assertions="21" errors="0" failures="1" skipped="0" time="17.595975">
    <testcase name="testAjaxCommands" file="core/tests/Drupal/FunctionalJavascriptTests/Ajax/CommandsTest.php" line="29" class="Drupal\FunctionalJavascriptTests\Ajax\CommandsTest" classname="Drupal.FunctionalJavascriptTests.Ajax.CommandsTest" assertions="21" time="17.595975">
      <failure type="PHPUnit\Framework\ExpectationFailedException">Drupal\FunctionalJavascriptTests\Ajax\CommandsTest::testAjaxCommands Failed asserting that '<html lang="en" dir="ltr" class=" js"><head>\n <meta charset="utf-8">\n ...
Why does run-tests.sh think it passed and where are those 14s from?
- 🇮🇹Italy mondrake 🇮🇹
🐛 Allow run-tests.sh to report skipped/risky/incomplete PHPUnit-based tests Needs work would probably help.
- 🇨🇭Switzerland berdir Switzerland
Created a merge request where I'm changing WebDriverTestBase to fail rather than skip Mink tests. I think that makes sense: if you want to run a JS test and something broke while setting it up, then it's not skipped, it failed?
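For reference, WebDriverTestBase::initMink() currently catches the DriverException thrown when a Mink/Selenium session can't be started and marks the test as skipped. A minimal sketch of the kind of change described above, simplified and not the exact MR code (assumes use Behat\Mink\Exception\DriverException;):
protected function initMink() {
  try {
    return parent::initMink();
  }
  catch (DriverException $e) {
    // Previously this called $this->markTestSkipped(), which let the job
    // pass when no browser session could be started. Failing instead makes
    // the broken environment visible in the test results.
    $this->fail("The test wasn't able to connect to your webdriver instance. The original message while starting Mink: " . $e->getMessage());
  }
}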
Making that change, which took me a few attempts, now results in a ton of these errors:
https://git.drupalcode.org/issue/drupal-3494332/-/pipelines/405480/test_...
Drupal\Tests\ckeditor5\FunctionalJavascript\ImageTestProviderTest::testAltTextRequired with data set "Restricted" Behat\Mink\Exception\DriverException: Could not open connection: Could not start a new session. New session request timed out Host info: host: 'runner-s8ex1x2yj-project-171984-concurrent-0-4p8yesea', ip: '10.0.86.194' Build info: version: '4.23.1', revision: '656257d8e9' System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.10.230-223.885.amzn2.x86_64', java.version: '17.0.12' Driver info: driver.version: unknown
Too much concurrency maybe on those JS tests? Will play a bit with adjusting that.
- 🇨🇭Switzerland berdir Switzerland
Lowering concurrency does reduce those fails, but not nearly enough.
concurrency 5: 380 fails
concurrency 3: 180 fails
concurrency 1: still 100 fails.
I suspect there are a few of those also in the HEAD/ main project/drupal pipelines, but definitely far less. Does the main project get more resources or something? are there limits on fork projects?
Looked a bit into increasing the timeout but not sure I understand this.
- 🇬🇧United Kingdom catch
I suspect there are a few of those also in the HEAD/ main project/drupal pipelines, but definitely far less. Does the main project get more resources or something? are there limits on fork projects?
Not that I'm aware of. The DA did discuss a separate cluster for core, but I would assume even that would include MR forks, since those are where we really want the speed more than the on-commit/branch checks, and also that we'd know about it if it had happened.
Some of this looks similar to 🐛 Fix selenium performance/stampede issues in gitlab config and BrowserTestBase Fixed; there are various notes on there, although it was a lot of trial and error and guesswork.
The failures on the latest pipeline don't look like timeouts but actual test failures now?
- 🇨🇭Switzerland berdir Switzerland
I added two deliberate fails; all others are the timeout error.
- 🇬🇧United Kingdom catch
I upped the maximum sessions and increased the timeout to 30. Now I'm seeing different driver failures in addition to the timeouts, although it's possible they were already there and it's just the first one I spotted:
Drupal\Tests\ckeditor5\FunctionalJavascript\ImageTestProviderTest::testResize with data set "Image resize is disabled" Behat\Mink\Exception\DriverException: Could not close connection
I had a theory on 🐛 Fix selenium performance/stampede issues in gitlab config and BrowserTestBase Fixed that opening a browser session for each test method and closing it again might be a problem - e.g. that instead we should try to open a session per test-class and close it at the end of the test class. However that would be a lot of refactoring for something that may or may not be the cause of the problem.
- 🇬🇧United Kingdom catch
I think this might have been introduced with 📌 Use lullabot/mink-selenium2-driver and lullabot/php-webdriver for functional browser testing Fixed and we never fully spotted it because of the skipping. I've pushed some commits to try to max-out the available connections, so far this is not resulting in less timeouts.
- 🇬🇧United Kingdom catch
Even at 64 parallel jobs it's not possible for every test method to successfully get a browser instance.
If we look at https://git.drupalcode.org/project/drupal/-/jobs/4136498, the 4th test method timed out, and the whole job took 6 minutes which suggests it really did time out.
I think we need to try:
I had a theory on #3463286: Fix selenium performance/stampede issues in gitlab config and BrowserTestBase that opening a browser session for each test method and closing it again might be a problem - e.g. that instead we should try to open a session per test-class and close it at the end of the test class.
e.g. each test class gets a browser session; reset, but don't close, the browser session in between methods. It'll mean using a static property.
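A rough sketch of that idea, keeping the session in a static property so it survives across the methods of one test class; the names are illustrative and this is not the actual MR code (assumes use Behat\Mink\Session;):
// Keep one browser session per test class instead of one per method.
protected static ?Session $classSession = NULL;

protected function initMink() {
  if (static::$classSession === NULL) {
    // First method in the class: start a real browser session.
    static::$classSession = parent::initMink();
  }
  else {
    // Later methods: reuse the session, resetting cookies/state in between.
    static::$classSession->reset();
  }
  return static::$classSession;
}

public static function tearDownAfterClass(): void {
  // Only close the browser once every method in the class has run.
  if (static::$classSession !== NULL) {
    static::$classSession->stop();
    static::$classSession = NULL;
  }
  parent::tearDownAfterClass();
}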
- First commit to issue fork.
- 🇬🇧United Kingdom longwave UK
CI_DEBUG_SERVICES should give us additional logging from the Selenium container which might tell us what's going on.
- 🇬🇧United Kingdom oily Greater London
Re #17: Not sure it's critical yet. So long as people are running the test-only test and it is failing then it must be working as there is no skipping going on. So far it seems a very small number of tests are afflicted. The worst malady seems to be that people may hunt endlessly for errors in their tests. That hurts.
- 🇨🇭Switzerland berdir Switzerland
Re #33: This is not limited to test-only jobs; that's just one minor aspect. It affects *all* test runs. We had multiple HEAD test fails because merge requests were committed whose tests were supposed to fail. It very much is critical, and it was set to critical by a core maintainer. There really isn't anything to discuss about that.
- 🇬🇧United Kingdom catch
https://git.drupalcode.org/project/drupal/-/merge_requests/11005 - this forces one-mink-session-per-test-class instead of one per method. It's not passing yet, but it's failing much more predictably: when we run out of sessions, all the tests after that point time out and fail.
- 🇬🇧United Kingdom oily Greater London
When tests skip, what is the range of possible reasons for that in GitLab pipelines? Does it mean that the test itself is terminating mid-test? Or is the whole test being skipped entirely, or could it be both?
- 🇬🇧United Kingdom catch
So long as people are running the test-only test and it is failing then it must be working as there is no skipping going on.
The skipping happens on regular functional JavaScript jobs when we run out of selenium browser sessions to use. Exactly why the browser sessions aren't cleaned up / recycled (at all, or quickly enough) is still a mystery, but I'm 100% sure that we're running out of them before we run out of tests to run now. Nearly have a green MR with the latest approach, apart from the intentional failures that @berdir introduced and a couple of others where the MR logic needs tidying up a bit.
- 🇬🇧United Kingdom oily Greater London
Re #37: I was working with @rkoller weeks ago (I teed it up for him; he did most of the work), and it was baffling us, since his JS seems good. I think I did experience one previously affected issue that @smustgrave was aware of and waved through, since the test was either relatively unimportant or clearly good. Not sure if things have grown progressively worse since then; it sounds possible from #37, but I'm not sure. Or maybe not many people uncovered the issue over the past few weeks because not many triggered the test-only job. Anyway, it seems progress is happening.
- 🇬🇧United Kingdom longwave UK
I'm finding similar reports in Selenium's issue queue and on StackOverflow but no concrete answers.
https://github.com/SeleniumHQ/docker-selenium/issues/2373
https://github.com/SeleniumHQ/selenium/issues/14457
https://stackoverflow.com/questions/65050928/org-openqa-selenium-webdriv...
https://stackoverflow.com/questions/79076438/what-is-causing-intermitten...
- 🇬🇧United Kingdom catch
Trying XVFB based on https://github.com/SeleniumHQ/docker-selenium/issues/2373
- 🇬🇧United Kingdom catch
Made a big mess of the MRs here trying out various strategies. The xvfb setting is the root cause, which is a great find from @longwave, buried in an unresolved GitHub comment.
https://git.drupalcode.org/issue/drupal-3494332/-/tree/3494332-try-to-pass is my attempt to get to a passing pipeline with only the changes necessary to fail properly when a connection can't be made to selenium + changes necessary to pass. It is not quite there yet but it is close.
- 🇬🇧United Kingdom catch
https://git.drupalcode.org/project/drupal/-/pipelines/406418 is green - I had to re-run one functional js job due to a random failure. Fixing this is likely to result in several random js failures showing up more frequently because the tests will actually run every time.
Tremendous weekend work from @catch, and nice find from @longwave.
Will the changes in MR11007 address the similar issues in the test-only job? The configuration in the test-only job uses *with-chrome instead of *with-selenium-chrome, so the Selenium env variable changes probably don't apply? Or should addressing the test-only job be split into another issue?
- 🇬🇧United Kingdom catch
with-chrome is the legacy non-W3C driver setup. We have an issue to switch the performance test job to that; should check if there's an existing one for the test-only job to switch it too, or maybe combine those into one.
tbh I don't think those environment variables affect those jobs but am not 100% sure that they don't either.
- 🇬🇧United Kingdom catch
Added a comment to the mink reset.
Yes, same problem in gitlab templates; I opened 🐛 Set SE_START_XVFB: 'true' for selenium w3c testing Active.
- 🇬🇧United Kingdom catch
Since the only eventual change here was in the YAML config and a comment, going to self-RTBC here.
- longwave → committed 03ef97f4 on 11.x
Issue #3494332 by catch, berdir, longwave, godotislate, oily, rkoller,...
- 🇬🇧United Kingdom longwave UK
Not backported to 11.1.x because of the minor behaviour change; it should be enough to catch things in 11.x and backport any necessary bug fixes. Also not backported to 10.5.x because we are using the non-W3C container there, which doesn't appear to exhibit this problem.
Committed and pushed to 11.x, thanks!
Automatically closed - issue fixed for 2 weeks with no activity.