smustgrave → credited mstrelan → .
Can you explain a bit more about where this path appending thing happens? Is it something we can maybe fix?
I'm offline for over a week now but when from what I recall you can see it in the html output for ExperimentalHelpTest if you remove the slash and run the test. Didn't look at what was causing it.
Why would the block report as being broken? What's causing it to be broken? This test needs to explain what it is doing!
IIRC there is a test module that includes config for a block, but that block is referencing the uuid of a block content entity that doesn't exist. Couldn't tell you why, but it's not really relevant. I think let's ignore that test for now.
Added 2 more, HelpTest and HelpTopicTest, both of this are much faster now but I don't have the numbers. I did have a go at HelpTopicsSyntaxTest but it was actually significantly slower!
@dww sure, I've created an MR against 11.x. As a bonus you don't have to review it one commit at a time because all the changes are limited to the help module, other than a7f43433.
Added a comment in #3517299-4: [PP-1] Convert functional tests in help module to kernel tests → that the only thing blocking multiple requests in the tests I've looked at is the missing preceding slash.
UiHelperTrait has a buildUrl method that ensures the path is absolute, I think we need to do the same thing here.
I investigated the multiple requests issue a little further, and it turns out the only issue is that the paths weren't prefixed with a preceding slash, so the path was appended to the path of the previous request resulting in 404s. After fixing those the updated timing is below.
MySQL
ExperimentalHelpTest - Kernel 3.246s, Functional 13.550s
HelpBlockTest - Kernel 2.818s, Functional 13.642s
HelpPageOrderTest - Kernel 3.216s, Functional 14.067s
HelpPageReverseOrderTest - Kernel 3.212s, Functional 13.407s
NoHelpTest - Kernel 3.831s, Functional 15.771s
Total: Kernel 16.323s, Functional 70.437s
MySQL w/ tmpfs
ExperimentalHelpTest - Kernel 1.739s, Functional 6.374s
HelpBlockTest - Kernel 1.634s, Functional 5.254s
HelpPageOrderTest - Kernel 1.686s, Functional 5.254s
HelpPageReverseOrderTest - Kernel 1.700s, Functional 5.404s
NoHelpTest - Kernel 1.754s, Functional 5.754s
Total: Kernel 8.513s, Functional 28.04s
Those numbers make this much more compelling.
Opened 📌 [PP-1] Convert functional tests in help module to kernel tests Postponed with a few more help tests I converted, see https://git.drupalcode.org/issue/drupal-3517299/-/merge_requests/1/diffs for the diff.
Added an MR against 3390193-add-drupalget-with-mink-to-kerneltestbase
but since it's against that branch it doesn't appear on the issue page, and the pipelines don't work. But I did it this way so we can see only the changes to Help module and not all the changes in the parent issue.
https://git.drupalcode.org/issue/drupal-3517299/-/merge_requests/1
It was a little annoying having to deal with the fact that we can't perform multiple requests in the same test, but overall it's still faster for the cases that I split. My test times below:
MySQL
ExperimentalHelpTest - Kernel 6.348s, Functional 13.550s
HelpBlockTest - Kernel 8.376s, Functional 13.642s
HelpPageOrderTest - Kernel 3.216s, Functional 14.067s
HelpPageReverseOrderTest - Kernel 3.212s, Functional 13.407s
NoHelpTest - Kernel 7.468s, Functional 15.771s
Total: Kernel 28.62s, Functional 70.437s
MySQL w/ tmpfs
ExperimentalHelpTest - Kernel 3.396s, Functional 6.374s
HelpBlockTest - Kernel 4.992s, Functional 5.254s
HelpPageOrderTest - Kernel 1.686s, Functional 5.254s
HelpPageReverseOrderTest - Kernel 1.700s, Functional 5.404s
NoHelpTest - Kernel 3.574s, Functional 5.754s
Total: Kernel 15.348, Functional 28.04s
So for MySQL with a normal filesystem this is a massive improvement! For tmpfs it's a decent improvement for the tests that only had one drupalGet, but it's much less of an improvement when the test had to be split and we repeat the setup tasks. It might be worth waiting for 📌 Allows DrupalKernel::handle to handle more than one requests (rough support for PHP-PM) Needs work for those tests.
mstrelan → changed the visibility of the branch 3390193-add-drupalget-with-mink-to-kerneltestbase to hidden.
mstrelan → created an issue.
I'm not sure whether to:
a. call AssertContentTrait::setRawContent in drupalGet()
b. change getTextContent() to use Mink, since that's now available? And then we can later on deprecate setRawContent().
I'm not sure either. I don't think we can deprecate setRawContent
because I think some tests use it after directly rendering things not using drupalGet
. Note that UiHelperTrait::drupalGet
has this comment that I think is not true either, because setRawContent
only happens in KernelTestBase.
* @return string
* The retrieved HTML string, also available as $this->getRawContent()
@mstrelan could you try adding this to the top of AssertContentTrait::getTextContent()?
Yeah that worked. I think probably the simplest solution is to call $this->content = $out;
from the new drupalGet, that also makes it work.
Could you push your work on that test somewhere so I can have a look at it?
Added patches for POC I did on block_content and help tests.
The docs on HttpKernelUiHelperTrait::drupalGet() say that page caching modules won't work. Should that list of warnings be moved from the method docs to the trait docs, so they're more visible?
I must admit I didn't look at the docs, I just expected it would work like BrowserTestBase. I don't think moving them would make a difference. I do think we could probably work around this though by either replacing that policy or modifying it to use a service that can be replaced or decorated to pretend it's not the CLI. But this is best suited to a follow up.
The 50x run has passed 3 times in a row now:
https://git.drupalcode.org/issue/drupal-3497701/-/jobs/4854823
https://git.drupalcode.org/issue/drupal-3497701/-/jobs/4856362
https://git.drupalcode.org/issue/drupal-3497701/-/jobs/4864839
Postponing on https://github.com/php-tuf/composer-stager/pull/414 but we could also focus on speeding up the test instead of increasing the timeout.
I tried to convert a few BrowserTestBase tests to kernel tests to try this out and have a couple observations.
1. \Drupal\Tests\help\Functional\HelpPageOrderTest
This calls $this->getTextContent()
which in BrowserTestBase resolves to \Drupal\Tests\UiHelperTrait::getTextContent
, but in KernelTestBase resolves to \Drupal\KernelTests\AssertContentTrait::getTextContent
which depends on AssertContentTrait::setRawContent
being called. Of course the test could be refactored to not call $this->getTextContent()
, but I'm wondering if we should do something about that here, such as calling AssertContentTrait::setRawContent
in drupalGet
. Once I worked around it the Kernel test passed in 3 seconds as opposed to 12 seconds before.
2. \Drupal\Tests\block_content\Functional\BlockContentPageViewTest
This required lots of refactoring, which is as expected, but I couldn't get the foobar_gorilla
block to load in ::testPageEdit
. I was able to manually place the system_powered_by_block
and get that to load, so I'm willing to concede that there's probably something I'm missing, rather than blocks not appearing because of the new drupalGet.
3. \Drupal\Tests\dynamic_page_cache\Functional\DynamicPageCacheIntegrationTest
This was probably a bit ambitious. The first blocker I couldn't easily work around was that the default cache policy for DPC prevents caching if PHP_SAPI is cli, which it is now that we're in a Kernel test. This might lead to other gotchas down the track if folks are expecting DPC to work when using drupalGet in Kernel tests.
That's as much as I've got time for today. Unfortunately I didn't find an existing test that could be converted cleanly, but nonetheless I'm quite excited about this.
I feel a bit uneasy about the test case. The existing tests run through a number of valid and invalid scenarios for the default validator. But currently you're replacing the default validator, so now we don't know if the scenarios would pass or fail with the original validator.
I think instead we should have an additional test case that is explicitly to test that swapping out the validator is respected by ::validateProps
.
Honestly I don't know why we need to be testing the behaviour of the codeblock plugin. This is provided by CKE5 core and is thoroughly tested there. We should only be testing that we can configure the languages, and that the languages get passed to the plugin. I'd argue we don't need a JS test at all for this, but I guess a basic smoke test that the plugin is loading and the configured languages are available would be reasonable.
Came across this triaging issues for Bug Smash Initiative and have a few thoughts:
- Since we're moving to Navigation module I checked if that is also affected and it is
- Since Claro and Starterkit Theme are unaffected should we really be fixing it for Stark?
- As per bnjmnm a stylesheet is certainly better than a style attribute. Not just for customisations, but for Content Security Policy also.
- I don't think toolbar.module is the right component for this since the issue is really with the dialog. Probably Javascript or Stark would be better. Even if it's a css fix, it's a javascript component we're styling.
Converted to an MR, updated line numbers in the IS.
Fixed \Drupal\Tests\views\Kernel\Plugin\ExposedFormRenderTest::testExposedFormRawInput
.
mstrelan → made their first commit to this issue’s fork.
Is this still relevant since
✨
Use one-time login link instead of user login form in BrowserTestBase tests
Fixed
? In most cases the test no longer requires the login form, unless the $useOneTimeLoginLinks
property is false. Furthermore, the report in the issue summary suggests you're using
Drupal Test Traits →
, so you could always override the drupalLogin
method in your own test base class.
Thanks @mondrake, some great points there. My initial thoughts on those:
1. I think automatically applying this rule everywhere in contrib and custom code would be quite disruptive, so it's almost a nice side effect that it would be opt-in. Ultimately I don't really care where it lives. It could probably even be a phpcs sniff that can be auto-fixed with phpcbf.
2. I think that could be handled separately in a follow up? Better to make progress than to get it perfect first go. Unless doing this makes things worse than the current state.
3. Absolutely. Keen to get more feedback first though.
Managed to fix 405 files and only had one fail. Have added a @phpstan-ignore to NodeAccessGrantsCacheContextTest for now. Changing from a Plan to a Task and setting Needs Review.
The IS mentions both Kernel and Functional tests should use \Drupal::service, but according to alexpott in #2699565-27: Replace \Drupal:: with $this->container->get() in test classes that allow dependency injection of Basic Auth module → this is only true for Functional tests.
I've created an MR with a phpstan rule to identify functional tests that access $this->container
. See https://git.drupalcode.org/issue/drupal-2066993/-/jobs/4841841 for results. We could land this and add the existing violations to the baseline. Then we can open follow up issues to cleanup the baseline. I suspect it should be simple enough to write a rector rule to replace $this->container->get()
with \Drupal::service()
.
mstrelan → made their first commit to this issue’s fork.
I think this is great, do we need something more generic long term?
Yeah the attribute I mentioned in the proposed resolution would be good to investigate. We can probably make use of the native phpunit attribtues for tests that are skipped on requirements like php extensions. Might be good to see if we can do another attribute for skipping on certain database driver too. But long term any permanently skipped tests should be fixed :joy:
mstrelan → created an issue.
Now we have yet-another-timeout to investigate - installing standard profile times out after 300 seconds:
Package Update (Drupal\Tests\package_manager\Build\PackageUpdate)
✘ Package update
┐
├ Symfony\Component\Process\Exception\ProcessTimedOutException: The process "/usr/local/bin/php ./core/scripts/drupal install standard --no-ansi" exceeded the timeout of 300 seconds.
│
│ /builds/issue/drupal-3497701/vendor/symfony/process/Process.php:1182
│ /builds/issue/drupal-3497701/vendor/symfony/process/Process.php:451
│ /builds/issue/drupal-3497701/vendor/symfony/process/Process.php:252
│ /builds/issue/drupal-3497701/core/tests/Drupal/BuildTests/Framework/BuildTestBase.php:336
│ /builds/issue/drupal-3497701/core/tests/Drupal/BuildTests/QuickStart/QuickStartTestBase.php:40
│ /builds/issue/drupal-3497701/core/modules/package_manager/tests/src/Build/TemplateProjectTestBase.php:187
│ /builds/issue/drupal-3497701/core/modules/package_manager/tests/src/Build/TemplateProjectTestBase.php:334
│ /builds/issue/drupal-3497701/core/modules/package_manager/tests/src/Build/PackageUpdateTest.php:22
┴
This failed for me on local the second time I ran it. I set up a job to repeat it 100 times and it failed 5/100 times, see https://git.drupalcode.org/issue/drupal-3389365/-/jobs/4840855
mstrelan → made their first commit to this issue’s fork.
The plot thickens. Looks like package manager has a bunch of special handling around composer patches. I've managed to run a test locally that will patch php-tuf/composer-stager to pass the timeout through to the rsync command. I tested it my setting the timeout to 1 second in ApiController::createAndApplyStage
and it failed after 1 second instead of 120.
I have a repeat job running with the timeout set to 180 - https://git.drupalcode.org/issue/drupal-3497701/-/jobs/4832276. If that passes (maybe we run it a few times), then we can probably postpone this on https://github.com/php-tuf/composer-stager/pull/414
Is there a way we can refactor the core MR to do this? E.g. maybe CoreServiceProvider passes a container param with some hints of where to scan, the custom module can modify the container param and then the BundleClassCollectorPass can find them.
Wondering if this can be incorporated in to the core MR which now does its own discovery outside of plugins - https://git.drupalcode.org/project/drupal/-/merge_requests/11545
In its current state it searches for an Entity
subdir in all container.namespaces
. Would be worth seeing if it can be made to work with this use case.
I think this is by design:
* This function loads the body part of a partial HTML document and returns a
* full \DOMDocument object that represents this document.
and
* @param string $html
* The partial HTML snippet to load. Invalid markup will be corrected on
* import.
Both suggest it's only intended to deal with partial snippets, not full documents.
Pushed an update to assert the response content, and restore the previous execution time to see if that catches anything.
The 120 timeout is still curious. I can't get xdebug working inside the php local server so it's hard to step through. Looks like it would be passing 300 all the way down to \PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer::sync
which calls $this->environment->setTimeLimit($timeout);
but I'm not sure that actually affects anything, e.g. in \PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer::runCommand
the timeout is not passed to $this->rsync->run
so it looks like it would just fallback to 120.
1. We should try to optimise the test and/or package_manager so we don't need a 30 second timeout.
I must admit I haven't tried to grok everything this test is trying to achieve, but it feels like PackageUpdateTest::testPackageUpdate
could use a mocked response from the controller, and the controller could be tested independently. But maybe that's losing end-to-end coverage that we're trying to test for.
2. We should try to make sure that fatal errors result in a 500 instead of 200 so we don't get false negatives on 200 assertions.
Another approach would be to return a success string from the controller and assert that we get that in the response. That would catch the OOM and PHP timeout issues we were seeing locally.
https://git.drupalcode.org/project/drupal/-/jobs/4824652 is the repeat test class job running with that change, it's varying between 400-700 seconds to complete, so very slow indeed, but so far each iteration is passing.
That job passed 96 times and failed 4 times. This time instead of there being no error response we have an error to look at which I've included below, I've replaced html encoded entities for readability. The TL;DR is the rsync command is sometimes timing out after 2 minutes. This timeout seems to come from \PhpTuf\ComposerStager\API\Process\Service\ProcessInterface::DEFAULT_TIMEOUT
, although it looks like \Drupal\package_manager\StageBase::create
should be passing its own default, which is 300, so I'm not sure what's going on there.
Error response: The website encountered an unexpected error. Try again later.Drupal\package_manager\Exception\StageException: Failed to run process: <em class="placeholder">The process "'/usr/bin/rsync' '--archive' '--checksum' '--delete-after' '--verbose' '--exclude=/PACKAGE_MANAGER_FAILURE.yml' '--exclude=/web/sites/simpletest' '--exclude=/vendor/.htaccess' '--exclude=/web/sites/default/files' '--exclude=/web/sites/default/files/.sqlite' '--exclude=/web/sites/default/files/.sqlite-shm' '--exclude=/web/sites/default/files/.sqlite-wal' '--exclude=/recipes' '--exclude=/web/sites/default/default.settings.php' '--exclude=/web/sites/default/default.services.yml' '--exclude=/web/sites/default/settings.php' '--exclude=/web/sites/default/settings.local.php' '--exclude=/web/sites/default/services.yml' '--exclude=/package_manager_test_event.log' '--exclude=/tmp/.package_managerd1e3013e-0986-48a6-993d-25ac3459c1b2/ccKD8PwqcDLB4Q2kf2hMBv1EFnJulEs6' 'tmp/build_workspace_29b4565cc0c9b8a68abd3d8a6d0e7b8fGPvmzn/project/' 'tmp/.package_managerd1e3013e-0986-48a6-993d-25ac3459c1b2/ccKD8PwqcDLB4Q2kf2hMBv1EFnJulEs6'"; exceeded the timeout of 120 seconds.</em> in Drupal\package_manager\StageBase->rethrowAsStageException() (line 371 of core/modules/package_manager/src/StageBase.php). Symfony\Component\Process\Process->wait() (Line: 252)
├ Symfony\Component\Process\Process->run(Object, Array) (Line: 269)
├ Symfony\Component\Process\Process->mustRun(Object) (Line: 105)
├ PhpTuf\ComposerStager\Internal\Process\Service\Process->mustRun(Object) (Line: 81)
├ PhpTuf\ComposerStager\Internal\Process\Service\AbstractProcessRunner->run(Array, Object, Array, Object) (Line: 121)
├ PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer->runCommand(Object, Object, Object, Object) (Line: 58)
├ PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer->sync(Object, Object, Object, Object, 300) (Line: 38)
├ PhpTuf\ComposerStager\Internal\Core\Beginner->begin(Object, Object, Object, Object, 300) (Line: 43)
├ Drupal\package_manager\LoggingBeginner->begin(Object, Object, Object, Object, 300) (Line: 354)
├ Drupal\package_manager\StageBase->create() (Line: 111)
├ Drupal\package_manager_test_api\ApiController->createAndApplyStage(Object) (Line: 73)
├ Drupal\package_manager_test_api\ApiController->run(Object)
├ call_user_func_array(Array, Array) (Line: 123)
├ Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->Drupal\Core\EventSubscriber\{closure}() (Line: 593)
├ Drupal\Core\Render\Renderer->executeInRenderContext(Object, Object) (Line: 121)
├ Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->wrapControllerExecutionInRenderContext(Array, Array) (Line: 97)
├ Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->Drupal\Core\EventSubscriber\{closure}() (Line: 183)
├ Symfony\Component\HttpKernel\HttpKernel->handleRaw(Object, 1) (Line: 76)
├ Symfony\Component\HttpKernel\HttpKernel->handle(Object, 1, 1) (Line: 53)
├ Drupal\Core\StackMiddleware\Session->handle(Object, 1, 1) (Line: 48)
├ Drupal\Core\StackMiddleware\KernelPreHandle->handle(Object, 1, 1) (Line: 28)
├ Drupal\Core\StackMiddleware\ContentLength->handle(Object, 1, 1) (Line: 32)
├ Drupal\big_pipe\StackMiddleware\ContentLength->handle(Object, 1, 1) (Line: 116)
├ Drupal\page_cache\StackMiddleware\PageCache->pass(Object, 1, 1) (Line: 90)
├ Drupal\page_cache\StackMiddleware\PageCache->handle(Object, 1, 1) (Line: 48)
├ Drupal\Core\StackMiddleware\ReverseProxyMiddleware->handle(Object, 1, 1) (Line: 51)
├ Drupal\Core\StackMiddleware\NegotiationMiddleware->handle(Object, 1, 1) (Line: 53)
├ Drupal\Core\StackMiddleware\AjaxPageState->handle(Object, 1, 1) (Line: 51)
├ Drupal\Core\StackMiddleware\StackedHttpKernel->handle(Object, 1, 1) (Line: 715)
├ Drupal\Core\DrupalKernel->handle(Object) (Line: 19)
├ require('/tmp/build_workspace_29b4565cc0c9b8a68abd3d8a6d0e7b8fGPvmzn/project/web/index.php') (Line: 71)
That's the same experience I had. Most likely it is timing out after 1 sec but still a 200. If you assert that the response is empty it will probably fail and show the timeout error.
Side note I wonder if we can get htmlOutput to work here.
AFK now but there is a constant in one of the test base classes that's setting it to 20 seconds. I'm not sure if there is something else about my local that's causing it to time out though.
FWIW I can replicate the exact error by adding this to the start of \Drupal\package_manager_test_api\ApiController::run
http_response_code(500);
die();
Running this test locally always fails for me, but it fails asserting the expected versions. That means makePackageManagerTestApiRequest
is passing.
I found it odd that the Error response:
in the output is empty, thinking maybe an error was thrown and system.logging error_logging was set to hide instead of verbose, but that's not the case. I explicitly put some errors in \Drupal\package_manager_test_api\ApiController::run
but I could always see the error response in the output.
I then thought maybe there is an OOM or timeout. I didn't have much luck down the OOM route, but I did find that if I add this line to makePackageManagerTestApiRequest
it would fail, even after asserting the 200 status code.
$this->assertSame('Finish', $session->getPage()->getContent());
1) Drupal\Tests\package_manager\Build\PackageUpdateTest::testPackageUpdate
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'Finish'
+'<br />
+<b>Fatal error</b>: Maximum execution time of 20 seconds exceeded in <b>/tmp/build_workspace_4d8944e5916659803141418ed3171c99nCcLIM/project/vendor/php-tuf/composer-stager/src/Internal/Finder/Service/FileFinder.php</b> on line <b>68</b><br />'
I'm not sure exactly how that helps, but hoping it can spark some ideas.
Rebased and updated the MR with a suggested fix, updated proposed resolution.
mstrelan → made their first commit to this issue’s fork.
I repeated the original job that failed 1 in 3 times and now it's not failing at all. https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4820216
Not sure what to make of that, maybe we don't need the trait after all. But I think the waitForElementVisible makes sense.
mstrelan → changed the visibility of the branch 3317520-settings-tray-repeat to hidden.
Using WaitTerminateTestTrait
seems to allow us to unskip ::testEditModeEnableDisable
without splitting to a separate class - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819919
But it failed once where I removed the usleep. Instead of usleep, why don't we just use waitForElementVisible
? This seems to pass 100/100 times - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819970. And once more for good measure https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819989.
Have created another MR with just these fixes and updated the issue summary.
Perfect. I've reviewed the MR and can see the duplicate has been removed and there is still one Spell-checking job in tact. I checked the pipeline and can see the Spell-checking job is running as expected.
I tried a few things, results below:
- Unskipping
::testEditModeEnableDisable
and repeating the class 1000 times resulted in a fail approx 1 in 3 times (skipped after 38 tests) - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819032 - Skipping
::testBlocks
and::testValidationMessages
and repeating the class 100 times resulted in 0 fails - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819193 - As above, but with usleep removed, 0 fails - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819250
- Moving
::testEditModeEnableDisable
to its own class and repeating the original class, 0 fails - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819383 - As above, but repeating the new test class, 0 fails - https://git.drupalcode.org/issue/drupal-3317520/-/jobs/4819468
So it seems if we simply split ::testEditModeEnableDisable
to its own class we can unskip it, but also appears we might be able to remove the usleep
. I don't have an explanation for why that might be the case though, I wonder if there's something strange going on with switching themes and if WaitTerminateTestTrait would help.
mstrelan → created an issue.
Adding tags
mstrelan → created an issue.
mstrelan → made their first commit to this issue’s fork.
Interesting stuff @murz. For the record, I don't have a preference whether we use Nightwatch, Playwright, Cypress or PHPUnit, they're all Selenium under the hood afaik. The main issues I have with the existing tests is the amount of setup required to perform a test. When debugging a failing test recently I found the reason the test failed was because the user didn't have permission for an element to exist. Then I discovered that the setup steps involved installing the site via the UI, logging in as admin, creating a role via the UI, assigning permissions by checking boxes via the UI, logging out, logging in again, creating a user with the new role, logging out, logging in again, enabling a module, logging out then finally logging the user in. In the step where the permissions were assigned, the browser dimensions were not big enough to be able to check the boxes to assign the permissions. The test carried on anyway and then eventually failed. Our Functional JavaScript tests allow us to offload most of that setup to direct API calls in PHP. If we can replicate that in any of these front end test tools then I think we're solving most of the problem.
Thanks for opening this, I was the one who suggested this in Slack. I had a bit of a tinker but I couldn't figure out the relationship between the child pipeline and the initial stage(s). Will be following this issue.
This seems very similar to #3041318: Various random fails due to mis-triggered Mink deprecation error → , should we combine this issue in to that one?
Converted #18 to an MR and hiding patches. The test in ManageFieldsFunctionalTest had moved ManageFieldsLifecycleTest. Also addressed probable issues in EntityQueryAccessTest as reported by @xjm in #23. I didn't find anything to update in WorkspaceIntegrationTest so not much I can do there without a failed test output. Not sure about repeating tests to prove this one, I guess we could try it for one or more of the test classes.
mstrelan → made their first commit to this issue’s fork.
Is this still relevant since 📌 Optimize last_write_timestamp writes in ChainedFastBackend Active ?
Apologies I didn't find this when I opened 📌 Move test middleware out of CoreServiceProvider Active . Wondering if this is still relevant? If we can avoid the container rebuild that would be ideal, but not sure if it's worth changing now.
I forgot to actually commit the code for Run 1 with WaitTerminateTestTrait, which is why it had 2 fails. I've moved the setWaitForTerminate
call where I intended it to be the first time and ran it again:
Run 4 - 0 fails
I've hidden the 2 test branches and opened a new branch (MR !11640) with only the fix. This is ready for review.
mstrelan → changed the visibility of the branch 3515404-repeat-fix to hidden.
mstrelan → changed the visibility of the branch 3515404-repeat to hidden.
Repeat 1000 times:
Run 1 - 7/1000 fails
Run 2 - 14/1000 fails
Run 3 - 10/1000 fails
With WaitTerminateTestTrait:
Run 1 - 2/1000 fails
Run 2 (after moving setWaitForTerminate) - 0/1000 fails
Run 3 - 0/1000 fails
Now that the low-hanging fruit is committed I've rebased the MR and commented on some of the less obvious changes. I've updated the issue summary with a proposed resolution on how to split the remaining work. Setting NR for agreement on that approach, and for eyes on the MR.
CSP without unsafe-inline will not allow the onload attribute, but you can use an event listener in an inline script with a nonce or hash instead.
// Test that logged in user does not get logged out in maintenance mode
// when hitting jsonapi route.
$this->container->get('state')->set('system.maintenance_mode', FALSE);
$this->drupalLogin($this->userCanViewProfiles);
$this->container->get('state')->set('system.maintenance_mode', TRUE);
$this->drupalGet('/jsonapi/taxonomy_term/tags');
$this->assertSession()->statusCodeEquals(503);
Sounds like state cache race condition. Probably needs WaitTerminateTestTrait
.
Looks like the random fails we saw in 134 are creeping back in, and we've got a green run on 134 with all the additional waits added, so maybe we should just stick with that.
The fails here are all due to resizing the browser. It seems in 128+ the window.innerHeight
is always 139px less than the height you resize to. I tried to debug this in non-headless mode, which was even weirder because resizing to 400px wide would set it to 500px width.
It seems that the 139px includes the browser chrome (toolbar, infobar, etc), but I'm not sure if the difference is in how the resizeWindow
works, or in how the browser reports window.innerHeight
. Either way, using window.outerHeight
makes BrowserWithJavascriptTest::testJavascript
pass.
There was an option to use --headless=old
which would mean we don't need to refactor any tests, but this has been removed in Chrome 132.
I was able to apply the same logic to LayoutBuilderDisableInteractionsTest::testFormsLinksDisabled
and increase the height by 139px, but it seems like a bit of a hack.
Tried the same approach for the Nightwatch tests but it seems like there must be something else going on there.
I found 📌 Provide the ability for JS functional tests to track whether the page has been reloaded Closed: outdated and #2936122: Find out why JavscriptTestBase occasionally needs a waitForElement on a normal page load → which are related.
Given the amount of the changes here maybe it's better to just go to 133 first and see if any big brains have better solutions for 134. Opened 📌 Update to selenium/standalone-chrome:133 Active and postponing this.
mstrelan → created an issue.
mstrelan → created an issue.
This seems much worse since 📌 Update to selenium/standalone-chrome:128 Active
Personal question when testing for multiple loops why add - <<: *with-selenium-chrome ?
The default config for repeating tests doesn't include a chrome service, which we need for JS tests. Perhaps it should by default, since a large percentage of random fails are in JS tests.
mstrelan → changed the visibility of the branch 3471113-chrome-133 to hidden.
mstrelan → changed the visibility of the branch 3471113-chrome-update-deps to hidden.
I ran in to InlineBlockTest as and also wondered why it's a FunctionalJavascript test. Looks like there are a few other tests using the same base class that probably don't need javascript either. Possibly we need to keep some of this using WebDriverTestBase but there is so much of this that could move to BrowserTestBase.
re #8: I'm hoping for something more generic, it's quite burdensome figuring out what to wait for after ::pressButton
. I've introduced a waitForDocumentReady
function which seems to work for most cases. Still not sure what changed in selenium or chrome that means this is necessary though. Still working through the fails but pushing my progress so far.
Good spotting, I'm not sure why that one didn't make it initially. I generated the baseline at level 2, then updated that method and regenerated it. This removed a further 96 errors from the baseline, so I think that's worth it.
Using WaitTerminateTestTrait
fixes this. Updated the IS with links to test runs.
mstrelan → changed the visibility of the branch 3387939-repeat-fix to hidden.
mstrelan → changed the visibility of the branch 3387939-repeat to hidden.
Thanks for addressing the isset feedback. I've left one more minor nit but happy to leave at RTBC.
Ran in to this again at https://git.drupalcode.org/issue/drupal-3316317/-/jobs/4767068, The line number is now 143, not 141, but otherwise the same.
This one is unusual because it's not a javascript test, so it's not a matter of waiting until the class is applied, it should have been there from the start. I've attached the browser output from a fail and a pass, and sure enough the claro-
classes don't appear in the output of the failed test. The only other differences are the assets query string and the user permissions hash.
I looked at where the claro-toolbar
and claro-toolbar-menu
classes come from and they are added directly in twig files in the claro theme. These files are overrides of twig files in the toolbar module, with no other changes, so this points to an issue with theme suggestions not being correctly applied. The test is loading the page with initial config, then setting config via API, then loading the same page again. I'm thinking we need to manually clear a cache, or perhaps wait for a cache to finish writing before loading the page.
Opened 2 MRs (both now closed) and ran the test 500 times in each. I think this one was more infrequent than 1/500 so unfortunately we didn't catch it. It's also probably not worth retrying, because even if we do catch it once in another 500 run, it doesn't really confirm that we've fixed it. I think this can go back to RTBC, but will leave that for someone else.
mstrelan → changed the visibility of the branch 3514924-repeat-fix to hidden.
mstrelan → changed the visibility of the branch 3514924-repeat to hidden.
Remove extra line as per @longwave
Added two MRs (now hidden) that repeated the test 500 times, one with the fix and one without. The test didn't fail for either. I think that's to be expected though, because we're not fixing anything here, we're removing a workaround because we think the problem is already fixed.
mstrelan → changed the visibility of the branch 3316317-repeat-fix to hidden.
mstrelan → changed the visibility of the branch 3316317-repeat to hidden.
Not sure what's changed, but it seems after any kind of pressButton
we need to wait for the next page to load. I don't know the best way to do that, but for proof of concept I've introduced a waitForAddressEquals
helper. Hope there is a more generic way.
Have rebased and restored the stragglers
Tried with Chrome 133 and 134, both seem to have failing tests. Also tried to only upgrade in the latest deps job but I guess I messed up the formatting.
mstrelan → made their first commit to this issue’s fork.
Switched to replacing 3 times with 500ms delay. The theory is that if it takes 4400ms for 1100 iterations, minus 1100ms in delays, that's 3300ms computation time, or around 3ms per iteration. A slower runner that is timing out might be taking 9ms per iteration, adding up to 11000 ms. But if the slow runner only has to do the work 3 times that's only ~27ms, on top of the 1500ms delay. Should be much less likely to time out.
MR !11584 demonstrates how this would fail on a slow runner, where the loop took more than 10 seconds to update the element 1100 times.
The controller was added in
#3211164: Random errors in Javascript Testing →
when we had WebDriverCurlService
but that's not really relevant anymore. I'm also not sure why we need to replace it 1100 times, it seems we could simplify this by replacing an element once or twice, at an interval of around 500-1000 ms.
mstrelan → created an issue.
Not sure if the original issue is still relevant for Gitlab CI. Rolled the patch in to an MR to remove the workaround. Guess we could try run it many times?
mstrelan → made their first commit to this issue’s fork.
Providing an update on the current stats:
HEAD (Level 1): 9606
HEAD (Level 2): 15067 (+5461)
HEAD + 3325057 (Level 2) 14636 (+5030)
When this issue was opened there were 9000+ additional errors, now we're down to around 5000.
FWIW the original pipeline for 📌 Check if we still need details.css Active passed before 📌 Move resize CSS into its own library Active was committed.