Installation fails if cron runs in parallel

Created on 21 January 2025, about 15 hours ago

Problem/Motivation

My Drupal CMS instance failed installation with a "Temporarily Unavailable" message displayed on the homepage. Upon reviewing the logs, I encountered the following error:
Drupal\Component\Plugin\Exception\PluginNotFoundException: The "node" entity type does not exist. in Drupal\Core\Entity\EntityTypeManager->getDefinition() (line 139 of core/lib/Drupal/Core/Entity/EntityTypeManager.php).
I attempted to reproduce this issue on a new Drupal CMS instance, but it did not occur. Comparing the logs of both instances, I noticed that the failing instance executed a cron job during the installation process.

The cron job was triggered by a JavaScript load, which appears to be linked to changes introduced in 📌 Search must be indexed after recipe is applied (Active). Specifically, the ECA event included the following description:

# By resetting the last cron run time, this forces the `content` search index
# to be rebuilt when it is created, because the next page request will run
# cron via the automated_cron module. This can be removed when there is a
# config action to rebuild a search index, or some other way for a recipe to
# trigger a cron run.

In summary, if the cron job runs in parallel with the Drupal CMS installation, it causes the process to fail.
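
To make the mechanism above concrete, here is a minimal sketch of the kind of threshold check that decides whether cron runs at the end of a request. It is not the automated_cron module's actual code: the function name cron_is_due(), the three-hour interval, and the assumption that "resetting" means storing 0 as the last run time are all illustrative.

```php
<?php

/**
 * Hypothetical helper: decides whether an automated cron run is due.
 *
 * @param int $lastRun  Timestamp of the last cron run (0 after a reset).
 * @param int $interval Minimum number of seconds between runs.
 * @param int $now      Timestamp of the current request.
 */
function cron_is_due(int $lastRun, int $interval, int $now): bool {
  // Assumption: a non-positive interval means automated cron is disabled.
  if ($interval <= 0) {
    return false;
  }
  return $now > $lastRun + $interval;
}

$now = 1737450000;  // Arbitrary example timestamp.
$interval = 10800;  // Three hours, an assumed interval.

// Cron ran a minute ago: the next request does not trigger it.
var_dump(cron_is_due($now - 60, $interval, $now)); // bool(false)

// The last run time was reset: the very next request triggers cron,
// whatever that request happens to be.
var_dump(cron_is_due(0, $interval, $now)); // bool(true)
```

Under these assumptions, the reset does not schedule cron for a particular moment. It only makes the next request of any kind eligible, which is why a request arriving mid-installation could start it.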

Steps to reproduce

  1. Run the following command in the terminal: ddev drush sql:drop --yes && ddev launch
  2. Select some recipes and start the installation.
  3. During the installation process, run: ddev drush cron

Proposed resolution

The exact root cause remains unclear. I attempted to prevent the cron job from running during installation by having the install_finished function set a new state variable (installation_completed), which the cron run() method checks before executing. However, this did not resolve the issue.
Code snippet of the attempted fix:

public function run() {
  // We can only run cron if the installation has been completed.
  if (!$this->state->get('installation_completed')) {
    $this->logger->warning('Attempting to run cron while the website is not fully installed.');
    return FALSE;
  }

  // Allow execution to continue even if the request gets cancelled.
  @ignore_user_abort(TRUE);
  // ... (snippet ends here; the rest of the method is not shown)
}
🐛 Bug report
Status

Active

Component

General

Created by

🇧🇷Brazil hfernandes


Comments & Activities

  • Issue created by @hfernandes
  • 🇩🇪Germany jurgenhaas Gottmadingen

    Oh, this is a great observation, and yes, we've implemented that tweak to run cron after applying the search recipe. Now, preventing that cron run before the installation is complete may solve the current issue but prevent the initial search indexing.

    I wonder if preventing automated cron from being triggered on Ajax requests would be a good way forward?

  • 🇺🇸United States phenaproxima Massachusetts

    I can't imagine that we want to trigger the cron run on Ajax requests, or during installation, so if we can adjust the ECA model to account for those situations, it seems like a worthwhile bug fix to me.

  • 🇩🇪Germany jurgenhaas Gottmadingen

    Running cron at the end of an Ajax request may actually be intentional; I'm not sure, though. My suggestion in #2 was to turn that off, without knowing if that would introduce further unwanted side effects.

    Running cron during installation (or right after it) is exactly what was intended by 📌 Search must be indexed after recipe is applied Active as the only way to trigger the initial indexing when the search recipe gets applied.

    Therefore, my conclusion is that we want the setup as is, but make sure that the cron doesn't run too early. But it should be running when the search recipe was enabled during installation.

  • 🇺🇸United States phenaproxima Massachusetts

    Okay - in that case, I guess ECA needs to be updated to react to RecipeAppliedEvent. :)

  • 🇩🇪Germany jurgenhaas Gottmadingen

    That would lead to the same behaviour. Whichever event we listen to, the action taken is to reset the last cron run so that the automated cron module will run cron at the end of the next HTTP request. And that may be too early, e.g. when an Ajax request from the installer comes in while the installer is in between certain tasks.

  • 🇧🇷Brazil hfernandes

    I'm not very familiar with the ECA module, but can it take a state variable into account?
    If it can, you might consider adding an additional validation step that checks whether the install_time state variable has been set. This would ensure the cron job does not run during the installation process.

  • 🇩🇪Germany jurgenhaas Gottmadingen

    Well, it can, but that doesn't solve the problem. We do want cron to run at the end of the next request; that's the only purpose of this. So any attempt to prevent that from happening would break the functionality and leave the search index uninitialized.

    From the issue summary, it appears that the cron run, which we do want, may come too early if it is triggered at the end of an Ajax request. That's why I'm suggesting that the automated cron module should not execute cron on an Ajax request, but only at the end of regular requests.

  • 🇧🇷Brazil hfernandes

    Also, if I understand correctly, the issue in 📌 Search must be indexed after recipe is applied Active would only occur if you have a running website and then install the search recipe, as the install_finished function triggers the cron job at the end of the installation process.

  • 🇩🇪Germany jurgenhaas Gottmadingen

    "as the install_finished function triggers the cron job at the end of the installation process"

    That's a thing, I didn't realize that before. A comment in that function even states:

    // Will also trigger indexing of profile-supplied content or feeds.
    

    So, we need that cron reset if the search recipe gets applied afterwards, but not if it was selected during the initial site installer. I'll have a look at how I can improve the ECA model accordingly.
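
The condition described in this comment could be sketched roughly as follows. This is not the ECA model itself, only plain PHP illustrating the rule: the function name maybe_reset_cron() is hypothetical, a plain array stands in for Drupal's state storage, and it is an assumption that install_time (mentioned earlier in the thread) is only set once the installer has finished and that system.cron_last holds the last cron run time.

```php
<?php

/**
 * Hypothetical rule: reset the last cron run only on an installed site.
 *
 * @param array $state Plain array standing in for Drupal's state storage.
 *
 * @return array The (possibly modified) state.
 */
function maybe_reset_cron(array $state): array {
  // Assumption: 'install_time' is only set once installation has finished.
  // While the installer is still running, leave cron alone.
  if (empty($state['install_time'])) {
    return $state;
  }
  // Assumption: 'system.cron_last' holds the last cron run timestamp.
  // Resetting it makes cron eligible to run after the next request.
  $state['system.cron_last'] = 0;
  return $state;
}

// Search recipe applied while the site is still being installed.
$installing = maybe_reset_cron(['system.cron_last' => 1737450000]);
var_dump($installing['system.cron_last']); // int(1737450000)

// Search recipe applied later, on an already installed site.
$installed = maybe_reset_cron([
  'install_time' => 1737440000,
  'system.cron_last' => 1737450000,
]);
var_dump($installed['system.cron_last']); // int(0)
```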

  • 🇬🇧United Kingdom catch

    "So, we need that cron reset if the search recipe gets applied afterwards, but not if it was selected during the initial site installer."

    Yes this ought to be enough.

    In 11.1, cron runs at the end of the installer (I didn't confirm that the Drupal CMS installer definitely calls that same code, but it seems likely); in 11.2 we removed the hard-coded cron run, but it's executed on the first 'real' request after install (with core's installer, this is the last page of the installer) via automated cron.

    Having said all this, cron runs should not be executing AJAX requests against the actual site at all, since not just cron but many other things could go wrong. @hfernandes, can you check which page might have triggered this?

  • 🇧🇷Brazil hfernandes

    @catch, the cron was triggered during one of the steps of the Drupal CMS installation process. I’m not entirely sure what action caused this call to be made in the middle of the process—perhaps something like a window resize that lazy-loaded an asset?

    It’s worth noting that the cron was triggered by a JS file load. Starting from Drupal 10.1, JS and CSS file loads are handled by two specific routes—system.css_asset and system.js_asset (reference: https://www.drupal.org/node/3358091 ). Should we consider preventing the automated_cron from being triggered if the route is one of these two?

    "in 11.2 we removed the hard-coded cron run, but it's executed on the first 'real' request after install (with core's installer, this is the last page of the installer) via automated cron"

    Could you share the issue number for that change? Additionally, are we considering that the automated_cron module isn’t included in some installation profiles, such as minimal?

  • 🇬🇧United Kingdom catch

    📌 Remove the automatic cron run from the installer Needs work is the installer issue.

    "Additionally, are we considering that the automated_cron module isn’t included in some installation profiles, such as minimal?"

    Yes, it was considered. If you install minimal, you can install automated_cron or set up a cron job to run from the CLI. The only consequence of not immediately running cron is that you get a reminder in the status report to set it up.

    "Starting from Drupal 10.1, JS and CSS file loads are handled by two specific routes: system.css_asset and system.js_asset (reference: https://www.drupal.org/node/3358091 ). Should we consider preventing the automated_cron from being triggered if the route is one of these two?"

    There is no particular reason to avoid running cron on those requests, it happens after the content is served, same as any other request.

    The installer itself should not be requesting aggregated assets, so there must still have been an initial AJAX request to some path, which in turn loaded js/css, which then ended up requesting those routes - i.e. this situation should only happen when the installer calls out to the not-yet-installed site. However, this is based on core's installer; it could be that the Drupal CMS installer is itself doing something that could be causing this, or that a contrib module is.

    It would not necessarily be a bad idea to limit automated cron to HTML GET responses in general, but when it was added it was intentionally wide-ranging so it would have the maximum chance of running frequently - e.g. if core page caching is on and a site has very little auth traffic, then potentially only contact form or search submissions might trigger it for days. Would not hurt to open a core issue though.
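
As a rough illustration of what "limit automated cron to HTML GET responses" might mean in practice, here is a minimal sketch of such a filter. It is an assumption about one possible approach, not a proposal taken from core: the function name may_trigger_cron() and its parameters are hypothetical, and a real implementation would read these values from the request and response objects.

```php
<?php

/**
 * Hypothetical filter: may this request/response pair trigger cron?
 *
 * @param string $method      HTTP request method, e.g. "GET".
 * @param string $contentType Content-Type header of the response.
 * @param bool   $isAjax      Whether the request was an Ajax request.
 */
function may_trigger_cron(string $method, string $contentType, bool $isAjax): bool {
  // Only plain, non-Ajax GET requests qualify.
  if (strtoupper($method) !== 'GET' || $isAjax) {
    return false;
  }
  // Accept "text/html" with or without a charset suffix.
  return stripos($contentType, 'text/html') === 0;
}

// A regular page view qualifies.
var_dump(may_trigger_cron('GET', 'text/html; charset=UTF-8', false)); // bool(true)

// An aggregated JS asset does not.
var_dump(may_trigger_cron('GET', 'application/javascript', false)); // bool(false)

// A form submission does not.
var_dump(may_trigger_cron('POST', 'text/html; charset=UTF-8', false)); // bool(false)

// An Ajax request does not, even if it returns HTML.
var_dump(may_trigger_cron('GET', 'text/html', true)); // bool(false)
```

The trade-off raised in the comment still applies: on a site with page caching and little authenticated traffic, a filter this strict would exclude form submissions, which may be the only requests that reach Drupal for days.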
