"Directory not empty" warning triggered by something in CI

Created on 17 January 2025, about 2 months ago

Problem/Motivation

Getting warnings on "next minor" tests that seem to be unrelated to the tested code.

2 tests triggered 2 PHP warnings:
1) /builds/project/content_fixtures/web/core/lib/Drupal/Core/File/FileSystem.php:261
rmdir(sites/simpletest/14332366): Directory not empty
Triggered by:
* Drupal\Tests\content_fixtures_example\Functional\ExampleFixturesTest::testCountFixtures
/builds/project/content_fixtures/modules/content_fixtures_example/tests/src/Functional/ExampleFixturesTest.php:71
2) /builds/project/content_fixtures/web/core/lib/Drupal/Core/File/FileSystem.php:261
rmdir(sites/simpletest/47771217): Directory not empty
Triggered by:
* Drupal\Tests\content_fixtures_example\Functional\ExampleFixturesTest::testArticleWithImageFixture
/builds/project/content_fixtures/modules/content_fixtures_example/tests/src/Functional/ExampleFixturesTest.php:99

Note that the directories triggering this are created by the pipeline (sites/simpletest/*) so this seems to be triggered outside of the scope of the tested code.

Steps to reproduce

Examples of the problem can be seen here:
- https://git.drupalcode.org/project/content_fixtures/-/pipelines/398856
- https://git.drupalcode.org/project/content_fixtures/-/pipelines/398943

It is reproducible in this specific project.

Proposed resolution

I have no idea. I'm not even sure if that's the correct place to report it, but I was redirected here from Slack channel.

🐛 Bug report
Status

Active

Component

gitlab-ci

Created by

🇵🇱Poland Luke_Nuke

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @Luke_Nuke
  • 🇬🇧United Kingdom jonathan1055

    Hi luke_nuke,
    I see that this project has only just enabled gitlab testing, so this problem is not something that has just been introduced in any recent gitlab templates update. Do you recall whether these tests did the same thing when they were run on DrupalCI? Maybe that was on an earlier PHPUnit version? Maybe the problem was there but PHPUnit did not complain then.

    Next Minor is running Drupal: 11.2-dev whereas "Current" is running Drupal: 11.1.1. But they both use PHP 8.3.16 and PHPUnit 10.5.41 so it must be a Drupal Core config difference which is driving this.

    The project does not have its own phpunti.xml(.dist) file so the config from core/phpunit.xml will be used. Maybe compare to see if that has changed between 11.1 and 11.2 ?

  • 🇵🇱Poland Luke_Nuke

    Hello jonathan1055.

    I don't remember any warnings on DrupalCI. Thanks for pointing out the differences in the phpunit.xml file. I compared 11.x and 11.1.x branches. There are some, but I don't see how they could be relevant to anything. Actually I tried to reproduce it locally, but couldn't, so there has to be something more to it than the phpunit alone.

  • 🇬🇧United Kingdom jonathan1055

    Maybe the first thing we need to establish is whether those directories are empty in 'current' 11.1 (so that the warning is not produced) or they are not empty but the check is not done and/or not reported. That would help in understanding whether 'next major' is doing something different to leave those directories not empty, or whether it is just reporting the fact differently.

    You could open a MR and customise the .gitlab-ci.yml add additional after_script</cocde> within <code>phputnit: attempting to show the contents of all the folders in sites/simpletest/. That will be used for all of the phpunit jobs, as all the variants extend from the main 'phpunit' job.

  • 🇵🇱Poland Luke_Nuke

    Upon further investigation, the file in `sites/simpletest/*` is `error.log`. The content of this file can be seen here: https://git.drupalcode.org/project/content_fixtures/-/jobs/4062256 . Looks like Drupal instance in the pipeline runs cron (?) and cannot log it because of the missing db tables, which causes creation of the `error.log` file, which triggers warnings during cleanups?

    I don't know if the actual error is lack of the tables or the fact that Drupal attempts to run its cron jobs when it shouldn't.

    Moreover, this seem to doesn't happen on MR-related pipelines, only on branches.

  • 🇵🇱Poland Luke_Nuke

    I compared Drupal branches: 11.x and 11.1.x: https://git.drupalcode.org/project/drupal/-/compare/11.x..11.1.x

    There seems to be cron related code added to the `install_finished` function, that may be relevant to the issue.

      // Run cron to populate update status tables (if available) so that users
      // will be warned if they've installed an out of date Drupal version.
      // Will also trigger indexing of profile-supplied content or feeds.
      \Drupal::service('cron')->run();
    

    If that's the reason for the issue happening in the "next minor" pipeline, then we need to determine why it causes trouble on branch related pipelines and not MR related pipelines.

  • 🇬🇧United Kingdom jonathan1055

    It seems that this cron stuff was added to... 11.1.x branch, so the one that actually works.

    Actually those cron lines in install_finished() have been there for years, are still in 11.1.x, but just recently removed from 11.x (which is what 'next minor' is testing with, until 11.2.x is created). See 📌 Remove the automatic cron run from the installer Needs work . So it can't be that cron run which is being executed when the error is caused.

    The errorlog is also available as a download artifact. I checked the contents of the file and they are slightly different which is odd. The table name and timestamp vary. The artifact has 1737223604 = which is 18:06:44 and the file you are showing in the job log has 1737223618 = 18:06:58 (14 seconds later), which means that this log you have displayed in the job is not the one we get on the download. But that is not the real issue here.

    Some other difference bteween 11.1.1 and 11.x must be causing the changed behavior. This comparison of 11.1.1 (left) vs 11.x (right) is very long, as many files are hidden.

  • 🇪🇸Spain fjgarlin

    It seems more and more like a difference between Drupal core versions. Core is trying to clean up a folder where some files have been placed. Also, these are warnings and they are failing because core configuration makes the job fail on warnings: https://git.drupalcode.org/project/drupal/-/blob/11.x/core/phpunit.xml.d...

    This is something that we could potentially control if/when 🐛 beStrictAboutOutputDuringTests=true or failOnRisky=true causing test failures due to PHP8.4 Implicit Nullable deprecation Active is done. But it really seems like a core issue. I guess they are not affected because they are running the tests with "run-tests.sh" as opposed to this example that is using PHPUnit binary (more info on how to switch between these in here https://project.pages.drupalcode.org/gitlab_templates/jobs/phpunit/).

    It'd be great to identify the exact file(s) being added to those folders. Then based on that, we can see if we need to do some tweaks or maybe report the issue in core.

  • 🇬🇧United Kingdom jonathan1055

    Funnily enough I was going to suggest trying with _PHPUNIT_CONCURRENT: 1 when discussing this with @luke_nuke on Slack. That would be worth doing, just to see the difference. Also what puzzles me is why there is no problem when the job is run in a MR pipeline. It fails for default branch testing. I wonder if it fails on an issue fork that it triggered manually via the UI. I can try that on 📌 Gitlab pipeline testing Active

  • 🇬🇧United Kingdom jonathan1055

    Hey I was not expecting this! ... I got the job to fail in the MR, which it has not done up to now.

    Before, with the defaul _PHPUNIT_CONCURRENT: 0 so it runs the plain phpunit binary - both jobs pass
    https://git.drupalcode.org/issue/content_fixtures-3500668/-/pipelines/40...

    Add _PHPUNIT_CONCURRENT: 1 so it uses run-tests.sh and the 'next minor' fails with rmdir() directory not empty.
    https://git.drupalcode.org/issue/content_fixtures-3500668/-/pipelines/40...

    So this is not necessarily a difference between Core 11.0 and 11.1, it also has to do with the differing config in phpunit.xml as @fjgalin noted above. The passing non-concurrent job gets phpunit.xml from Gitlab Templates /assets (because the project does not have its own), but the concurrent=1 run-tests.sh does not do that. I don't know what config options are set in that case, but I think this is the cause of the difference.

    Does any of this get us nearer to finding out what is in the directory when it is expected to be empty? At least we can now replicate the error in a MR and look further.

  • 🇪🇸Spain fjgarlin

    Maybe we can add a patch to be applied in core in core/lib/Drupal/Core/File/FileSystem.php:261

        else {
           +$objects = scandir($uri);
           +if (count($objects) != 2) {
           +   var_dump($objects);
           +}
          return rmdir($uri);
        }
    
  • 🇬🇧United Kingdom jonathan1055

    Thanks for that idea, and for providfing the patch. I first tried the "expanded" patches.json format and it failed with "Array to string conversion", no idea what was wrong
    https://git.drupalcode.org/issue/content_fixtures-3500668/-/pipelines/40...

    But I changed to the compact patches format and this worked fine.
    https://git.drupalcode.org/issue/content_fixtures-3500668/-/pipelines/40...

    However, in both runs* the 'next minor' pipeline ran OK and did not fail on rmdir as reported in #12. Could this be a random error? I will run the pipeline again.

    [* both runs, because on the first pipeline I accidently was patching 'composer' not 'composer (next minor)' which menat that the next minor job was not affected, and ran]

  • 🇬🇧United Kingdom jonathan1055

    I think it is random, as I re-ran the green next minor and got the failure :-)
    https://git.drupalcode.org/issue/content_fixtures-3500668/-/jobs/4097189

  • 🇪🇸Spain fjgarlin

    Am I reading the output well? It seems that only when there is a failure, an "error.log" file is written to the folder and that's why the folder isn't empty. However, the jobs don't seem to capture that error at all, I wonder why.

    It might be that core is not capturing that type of exception (DatabaseExceptionWrapper)... Actually, the code here https://git.drupalcode.org/project/drupal/-/blob/11.x/core/modules/dblog... captures the exception but then does not trigger a new one, but if the error.log file is populated with that information that would explain the issue.

    So yeah, it really seems like a random error more and more, and when it happens, it might be Drupal's fault that the directory is not empty (since we are not doing any htaccess configuration for this).

  • 🇬🇧United Kingdom jonathan1055

    Given that some random sql errors are known to occur (which seems to be the initial reason for the failures) is the main point here that the job just falls in a heap with nothing didsplayed in the job log? The information about missing watchdog table (which I think reallly means "no mysql") is shown in the downloadable apache.error.log artifact. But the unix timestamp being insterted there is a few seconds before the timestamp in the error.log file we have displayed in the job. Which means there have been two attempts at the insert, and it is the second occurence that triggers the exception we see in the job log? which in turn causes the rmdir() to fail.

  • 🇪🇸Spain fjgarlin

    I'm thinking more and more that if this is really not reproducible and it's just random, there isn't much we can do with it. If rerunning makes it work then we might need to live with it in some cases. This is something common in core as well.

    If/when we work on 🐛 beStrictAboutOutputDuringTests=true or failOnRisky=true causing test failures due to PHP8.4 Implicit Nullable deprecation Active , we should be able to set some options, like failOnWarnings to false, which would make cases like this green, if that's what we'd want.

    I'm not sure if you want to do more investigation, but if there isn't a reproducible step by step here we might need to close it with "cannot reproduce".

  • 🇬🇧United Kingdom jonathan1055

    Yes I accept that the original cause of the error may be random, as in the MySql database goes away or whatever. But you linked to a core function in #17 which could be the place where that information is not being trapped and displayed.

    Or are we saying "in these cases where nothing is shown in the phpunit job log we must always remember to browse/download the error.log artifact as that may have more info, and if that file shows sql errors like 'table not found' then its a random error and try re-running the job".

    If we can't do anything, and we also decide not to raise a core issue, then the above could be added to our phpunit doc page.

  • 🇪🇸Spain fjgarlin

    I keep on thinking that it's an issue with core, where the second fail is somehow "lost", or if the fail is fixed, it's failing to clean-up the folder before running the rmdir, but I don't know how to replicate, which will make it difficult if we create a core issue. Maybe mocking or triggering errors in a test but I am not that well-versed so I don't even know if what I wrote make too much sense...

    From the templates point of view, we could:
    - More work: capture the error and read the error log file. If we find "table not found" then we can put a message in the console to ask the user to retry.
    - Less work: document it (we should probably do this regardless)

Production build 0.71.5 2024