- 🇦🇺Australia mstrelan
Now we have yet-another-timeout to investigate - installing standard profile times out after 300 seconds:
Package Update (Drupal\Tests\package_manager\Build\PackageUpdate) ✘ Package update ┐ ├ Symfony\Component\Process\Exception\ProcessTimedOutException: The process "/usr/local/bin/php ./core/scripts/drupal install standard --no-ansi" exceeded the timeout of 300 seconds. │ │ /builds/issue/drupal-3497701/vendor/symfony/process/Process.php:1182 │ /builds/issue/drupal-3497701/vendor/symfony/process/Process.php:451 │ /builds/issue/drupal-3497701/vendor/symfony/process/Process.php:252 │ /builds/issue/drupal-3497701/core/tests/Drupal/BuildTests/Framework/BuildTestBase.php:336 │ /builds/issue/drupal-3497701/core/tests/Drupal/BuildTests/QuickStart/QuickStartTestBase.php:40 │ /builds/issue/drupal-3497701/core/modules/package_manager/tests/src/Build/TemplateProjectTestBase.php:187 │ /builds/issue/drupal-3497701/core/modules/package_manager/tests/src/Build/TemplateProjectTestBase.php:334 │ /builds/issue/drupal-3497701/core/modules/package_manager/tests/src/Build/PackageUpdateTest.php:22 ┴
- 🇦🇺Australia mstrelan
The plot thickens. Looks like package manager has a bunch of special handling around composer patches. I've managed to run a test locally that will patch php-tuf/composer-stager to pass the timeout through to the rsync command. I tested it my setting the timeout to 1 second in
ApiController::createAndApplyStage
and it failed after 1 second instead of 120.I have a repeat job running with the timeout set to 180 - https://git.drupalcode.org/issue/drupal-3497701/-/jobs/4832276. If that passes (maybe we run it a few times), then we can probably postpone this on https://github.com/php-tuf/composer-stager/pull/414
- 🇦🇺Australia mstrelan
Pushed an update to assert the response content, and restore the previous execution time to see if that catches anything.
The 120 timeout is still curious. I can't get xdebug working inside the php local server so it's hard to step through. Looks like it would be passing 300 all the way down to
\PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer::sync
which calls$this->environment->setTimeLimit($timeout);
but I'm not sure that actually affects anything, e.g. in\PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer::runCommand
the timeout is not passed to$this->rsync->run
so it looks like it would just fallback to 120. - 🇦🇺Australia mstrelan
1. We should try to optimise the test and/or package_manager so we don't need a 30 second timeout.
I must admit I haven't tried to grok everything this test is trying to achieve, but it feels like
PackageUpdateTest::testPackageUpdate
could use a mocked response from the controller, and the controller could be tested independently. But maybe that's losing end-to-end coverage that we're trying to test for.2. We should try to make sure that fatal errors result in a 500 instead of 200 so we don't get false negatives on 200 assertions.
Another approach would be to return a success string from the controller and assert that we get that in the response. That would catch the OOM and PHP timeout issues we were seeing locally.
https://git.drupalcode.org/project/drupal/-/jobs/4824652 is the repeat test class job running with that change, it's varying between 400-700 seconds to complete, so very slow indeed, but so far each iteration is passing.
That job passed 96 times and failed 4 times. This time instead of there being no error response we have an error to look at which I've included below, I've replaced html encoded entities for readability. The TL;DR is the rsync command is sometimes timing out after 2 minutes. This timeout seems to come from
\PhpTuf\ComposerStager\API\Process\Service\ProcessInterface::DEFAULT_TIMEOUT
, although it looks like\Drupal\package_manager\StageBase::create
should be passing its own default, which is 300, so I'm not sure what's going on there.Error response: The website encountered an unexpected error. Try again later.Drupal\package_manager\Exception\StageException: Failed to run process: <em class="placeholder">The process "'/usr/bin/rsync' '--archive' '--checksum' '--delete-after' '--verbose' '--exclude=/PACKAGE_MANAGER_FAILURE.yml' '--exclude=/web/sites/simpletest' '--exclude=/vendor/.htaccess' '--exclude=/web/sites/default/files' '--exclude=/web/sites/default/files/.sqlite' '--exclude=/web/sites/default/files/.sqlite-shm' '--exclude=/web/sites/default/files/.sqlite-wal' '--exclude=/recipes' '--exclude=/web/sites/default/default.settings.php' '--exclude=/web/sites/default/default.services.yml' '--exclude=/web/sites/default/settings.php' '--exclude=/web/sites/default/settings.local.php' '--exclude=/web/sites/default/services.yml' '--exclude=/package_manager_test_event.log' '--exclude=/tmp/.package_managerd1e3013e-0986-48a6-993d-25ac3459c1b2/ccKD8PwqcDLB4Q2kf2hMBv1EFnJulEs6' 'tmp/build_workspace_29b4565cc0c9b8a68abd3d8a6d0e7b8fGPvmzn/project/' 'tmp/.package_managerd1e3013e-0986-48a6-993d-25ac3459c1b2/ccKD8PwqcDLB4Q2kf2hMBv1EFnJulEs6'"; exceeded the timeout of 120 seconds.</em> in Drupal\package_manager\StageBase->rethrowAsStageException() (line 371 of core/modules/package_manager/src/StageBase.php). Symfony\Component\Process\Process->wait() (Line: 252) ├ Symfony\Component\Process\Process->run(Object, Array) (Line: 269) ├ Symfony\Component\Process\Process->mustRun(Object) (Line: 105) ├ PhpTuf\ComposerStager\Internal\Process\Service\Process->mustRun(Object) (Line: 81) ├ PhpTuf\ComposerStager\Internal\Process\Service\AbstractProcessRunner->run(Array, Object, Array, Object) (Line: 121) ├ PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer->runCommand(Object, Object, Object, Object) (Line: 58) ├ PhpTuf\ComposerStager\Internal\FileSyncer\Service\FileSyncer->sync(Object, Object, Object, Object, 300) (Line: 38) ├ PhpTuf\ComposerStager\Internal\Core\Beginner->begin(Object, Object, Object, Object, 300) (Line: 43) ├ Drupal\package_manager\LoggingBeginner->begin(Object, Object, Object, Object, 300) (Line: 354) ├ Drupal\package_manager\StageBase->create() (Line: 111) ├ Drupal\package_manager_test_api\ApiController->createAndApplyStage(Object) (Line: 73) ├ Drupal\package_manager_test_api\ApiController->run(Object) ├ call_user_func_array(Array, Array) (Line: 123) ├ Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->Drupal\Core\EventSubscriber\{closure}() (Line: 593) ├ Drupal\Core\Render\Renderer->executeInRenderContext(Object, Object) (Line: 121) ├ Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->wrapControllerExecutionInRenderContext(Array, Array) (Line: 97) ├ Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->Drupal\Core\EventSubscriber\{closure}() (Line: 183) ├ Symfony\Component\HttpKernel\HttpKernel->handleRaw(Object, 1) (Line: 76) ├ Symfony\Component\HttpKernel\HttpKernel->handle(Object, 1, 1) (Line: 53) ├ Drupal\Core\StackMiddleware\Session->handle(Object, 1, 1) (Line: 48) ├ Drupal\Core\StackMiddleware\KernelPreHandle->handle(Object, 1, 1) (Line: 28) ├ Drupal\Core\StackMiddleware\ContentLength->handle(Object, 1, 1) (Line: 32) ├ Drupal\big_pipe\StackMiddleware\ContentLength->handle(Object, 1, 1) (Line: 116) ├ Drupal\page_cache\StackMiddleware\PageCache->pass(Object, 1, 1) (Line: 90) ├ Drupal\page_cache\StackMiddleware\PageCache->handle(Object, 1, 1) (Line: 48) ├ Drupal\Core\StackMiddleware\ReverseProxyMiddleware->handle(Object, 1, 1) (Line: 51) ├ Drupal\Core\StackMiddleware\NegotiationMiddleware->handle(Object, 1, 1) (Line: 53) ├ Drupal\Core\StackMiddleware\AjaxPageState->handle(Object, 1, 1) (Line: 51) ├ Drupal\Core\StackMiddleware\StackedHttpKernel->handle(Object, 1, 1) (Line: 715) ├ Drupal\Core\DrupalKernel->handle(Object) (Line: 19) ├ require('/tmp/build_workspace_29b4565cc0c9b8a68abd3d8a6d0e7b8fGPvmzn/project/web/index.php') (Line: 71)
- 🇬🇧United Kingdom catch
https://git.drupalcode.org/project/drupal/-/jobs/4824652 is the repeat test class job running with that change, and even 10 or so tests in, the test is taking longer and longer each iteration to complete.
Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 429s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 445s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 457s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 497s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 509s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 516s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 517s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 525s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 558s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 584s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 595s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 620s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 638s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 663s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 668s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 673s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 674s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 676s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 677s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 677s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 677s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 678s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 679s Drupal\Tests\package_manager\Build\PackageUpdateTest 1 passes 679s
I wonder if we are hitting github rate limits or something like that.
- 🇬🇧United Kingdom catch
Ahh thanks this is very confusing.
I change the max execution time to 30 and it passed locally, pushed an MR for this.
I think we need two spin-off issues:
1. We should try to optimise the test and/or package_manager so we don't need a 30 second timeout.
2. We should try to make sure that fatal errors result in a 500 instead of 200 so we don't get false negatives on 200 assertions.
- @catch opened merge request.
- 🇦🇺Australia mstrelan
That's the same experience I had. Most likely it is timing out after 1 sec but still a 200. If you assert that the response is empty it will probably fail and show the timeout error.
Side note I wonder if we can get htmlOutput to work here.
- 🇬🇧United Kingdom catch
This test fails for me locally with:
1) Drupal\Tests\package_manager\Build\PackageUpdateTest::testPackageUpdate Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'1.1.0' +'1.0.0' /var/www/html/core/modules/package_manager/tests/src/Build/PackageUpdateTest.php:63
which looks unrelated to the random test failure but also means I can't get a green test run to then try to make it fail the same way from there. Also tried reducing TemplateProjectTestBase::MAX_EXECUTION_TIME to 1 and that doesn't change anything for me.
- 🇦🇺Australia mstrelan
AFK now but there is a constant in one of the test base classes that's setting it to 20 seconds. I'm not sure if there is something else about my local that's causing it to time out though.
- 🇬🇧United Kingdom catch
Fatal errors can often be 200s, I think we have an issue somewhere.
The timeout is not good, probably reflects something that could happen during actual operation.
Do we know where the 20 seconds max execution time is coming from? Bit surprised it's not 30 seconds. We could increase that to see if it helps the random failures and open a separate issue for the timeout.
- 🇦🇺Australia mstrelan
FWIW I can replicate the exact error by adding this to the start of
\Drupal\package_manager_test_api\ApiController::run
http_response_code(500); die();
- 🇦🇺Australia mstrelan
Running this test locally always fails for me, but it fails asserting the expected versions. That means
makePackageManagerTestApiRequest
is passing.I found it odd that the
Error response:
in the output is empty, thinking maybe an error was thrown and system.logging error_logging was set to hide instead of verbose, but that's not the case. I explicitly put some errors in\Drupal\package_manager_test_api\ApiController::run
but I could always see the error response in the output.I then thought maybe there is an OOM or timeout. I didn't have much luck down the OOM route, but I did find that if I add this line to
makePackageManagerTestApiRequest
it would fail, even after asserting the 200 status code.$this->assertSame('Finish', $session->getPage()->getContent());
1) Drupal\Tests\package_manager\Build\PackageUpdateTest::testPackageUpdate Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'Finish' +'<br /> +<b>Fatal error</b>: Maximum execution time of 20 seconds exceeded in <b>/tmp/build_workspace_4d8944e5916659803141418ed3171c99nCcLIM/project/vendor/php-tuf/composer-stager/src/Internal/Finder/Service/FileFinder.php</b> on line <b>68</b><br />'
I'm not sure exactly how that helps, but hoping it can spark some ideas.
Automatically closed - issue fixed for 2 weeks with no activity.
- 🇦🇺Australia mstrelan
Converted #18 to an MR and hiding patches. The test in ManageFieldsFunctionalTest had moved ManageFieldsLifecycleTest. Also addressed probable issues in EntityQueryAccessTest as reported by @xjm in #23. I didn't find anything to update in WorkspaceIntegrationTest so not much I can do there without a failed test output. Not sure about repeating tests to prove this one, I guess we could try it for one or more of the test classes.
- @mstrelan opened merge request.
- First commit to issue fork.