ModuleHandler skips all hook implementations when invoked before the module files have been loaded

Created on 8 April 2021, about 3 years ago
Updated 20 June 2024, 5 days ago

Problem/Motivation

Under certain circumstances the module files are not yet loaded.

This happens for example when invoking a hook inside the constructor of a http_middleware service; theses services are constructed very early as a dependency of http_kernel service.

A more concrete example is a middleware service using the entity_type.manager. Most of the times the entity type information is retrieved from cache (stored in the discovery cache bin). When this cache however is missing, hooks like hook_entity_type_build() and hook_entity_type_alter() need to be invoked at this early stage. The modules files however haven't been loaded yet, making ModuleHanlder::verifyImplementations() skip all the hook implementations (silently!). The entity definition cache is then incomplete resulting in various hard to debug (next to impossible) exceptions like ๐Ÿ› The "paragraph" entity type did not specify a translation handler. Closed: won't fix .

Steps to reproduce

Taken from this comment ๐Ÿ’ฌ Route "entity.path_alias.collection" does not exist Fixed .

  1. Just uninstalled the dblog module from UI here /admin/modules/uninstall
  2. Then open any menu /admin/structure/menu/manage/[menu-name] and I see an error:
    Uncaught PHP Exception Drupal\Component\Plugin\Exception\InvalidPluginDefinitionException: "The "menu" entity type did not specify a "edit" form class." at /core/lib/Drupal/Core/Entity/EntityTypeManager.php line 211
  3. Then click to Flush views cache on another admin page (Flush all caches > Flush views cache via admin menu).
  4. After #3 I can see the error:
    Uncaught PHP Exception Symfony\Component\Routing\Exception\RouteNotFoundException: "Route "entity.path_alias.collection" does not exist." at /core/lib/Drupal/Core/Routing/RouteProvider.php line 206

Why does it happen?

After #3, a lot of items were removed from router table from the database. You have to launch drush cr twice to fix such error.

Proposed resolution

If a hook is invoked before the modules being loaded, load all modules.

Remaining tasks

Update issue summary
Address #12 and #17

๐Ÿ› Bug report
Status

Needs work

Version

11.0 ๐Ÿ”ฅ

Component
Bootstrapย  โ†’

Last updated 5 days ago

No maintainer
Created by

๐Ÿ‡ณ๐Ÿ‡ฑNetherlands casey

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • Status changed to Needs work over 1 year ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Issue summary should be updated with steps to reproduce.

    Added addressing #12 and #17 to remaining tasks.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada joseph.olstad

    patch 15 had fuzz, removing fuzz to be able to try testing with newer versions.

    straight up reroll, no interdiff, just fuzz removal.

  • ๐Ÿ‡ธ๐Ÿ‡ฎSlovenia joco_sp

    #27 worked for me also on core 9.5.5.
    Thank you all for making that patch, because it solved a big issue :D If I see the error reappearing, I'll report it here.

  • Was unable to add a new view suddenly for some reason. Patch #27 fixed the issue.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom alexpott ๐Ÿ‡ช๐Ÿ‡บ๐ŸŒ

    This is a important patch that makes the ModuleHandler more robust. Looking at the code I'm pretty sure that we do not need to call loadAll() from buildImplementationInfo() because we already do a reload() in buildHookInfo(). I'm pretty use we only need the call in verifyImplementations(). I think we should be adding a middleware test that calls a hook prior to the regular loadAll() call from \Drupal\Core\DrupalKernel::preHandle().

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom AdamPS

    FYI I believe this caused ๐Ÿ› Caching bug enabling/disabling overrides Fixed . I found a fix which is to check ModuleHandlerInterface::isLoaded(), and skip any further processing if FALSE.

    So yes this would be really useful to fix as a priority. As already noted, the resulting problems are pretty tricky to debug. The symptoms (to "randomly" skip calling hooks) are quite severe, potentially even security related.

  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland Berdir Switzerland

    I'm unsure if we should support this. Yes, it makes it more robust, but at the cost of possibly unpredictable performance issues? Forcing modules to load in a middleware will basically eliminate the reason middlewares exist in the first place.. to have very fast pre-full-bootstrap logic, like page cache.

    What about throwing an exception instead if we detect this (trying to invoke a hook before a full bootstrap), so it fails consistently and reliably when adding such a hook? If you do that then you know that you have to implement things differently. And if you really really want to do this, you could still call loadAll manually? For BC, we could add a deprecation message with the loadAll() fix as an intermediate step?

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom AdamPS

    I'm unsure if we should support this.

    Good point. The currently behaviour is: run no hooks and cache the result. This seems clearly undesirable, however it's a good idea to think carefully about what is right.

    What about throwing an exception instead if we detect this

    We could try it, however It's pretty easy to trigger a hook - the call to invoke the hook could be buried within a function that we want to call. There might be places in existing middleware code that trigger a hook as a side-effect of their function. Or, like #32, places in ordinary module code that can "unexpectedly" run early in the bootstrap cycle.

    We might need a way to disable the exception, i.e. run no hooks and don't cache the result.

  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland Berdir Switzerland

    Not running the hooks does not work IMHO. We don't have a way to pass that information back, and caching happens elsewhere. One of the most common issues related to this is incomplete entity type definitions because build/alter hooks weren't invoked. Things will be completely unpredictable then. For example, information about bundle classes might be missing and then it's not the class you'd expect to get back in your code.

    And yes, hooks are typically called several steps further down, for example the entity type thing, if you load a node in a middleware for example. But an exception will fail right there and then and the stacktrace will tell you exactly which middleware and line of code is the problem. Unlike now, where it will suddenly blow up in a completely different place, for example when editing a node with paragraphs.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom AdamPS

    @Berdir, yes thanks it makes sense in general. However it seems to imply that middleware cannot do anything that triggers hooks (such as to load a node) because that would trigger the exception. Are you suggesting that this becomes a strict limitation of middleware? It could imply even that adding a new hook anytime in the future would be a BC-break because middleware might be using the code where the hook is being added. And when we say middleware, we actually mean anything that might run "early", for example implementations of ConfigFactoryOverrideInterface.

    I expect that we would want a function that code can use to check if it is yet safe to use hooks.

    So I was wondering if this might be too strict, and we could allow middleware to call a function to disable all hook exceptions (run no hooks and don't cache the result), or even just for a specific hook. However I can see that could be problematic because disabling some hooks could even have security implications.

  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland Berdir Switzerland

    There's no such thing as only loading one/some hooks, what's this about is loading the .module files of all modules so that hook discovery can figure out if a certain function exists or not.

    And yes, middlewares should absolutely be as fast and slim as possible. They are specifically designed to run in very-early-bootstrap and use the decorator pattern, so they're always invoked including their dependencies. You should only use a middleware if there is a benefit to running before modules are loaded and the general bootstrap happens. For example page cache and the IP ban module, both benefit from being able to return a response as quickly as possible to achieve fast response times and minimal overhead if a certain IP attempts to send a massive amount of requests. If it doesn't need to be that fast or if you have quite a few dependencies, use a request event subscriber.

    It's like hook_boot() in D7, that also only supported a limited set of API's.

    For example, instead of using the config system in a middleware, you could dynamically set a parameter on it with a service provider, or only support configuration through container parameters or settings in the first place instead of config with a UI.

  • We had been running into an issue for some time on one of our Drupal instances where performing a cache-rebuild (via Drush) while the site was serving web traffic would result in a corrupt module_implements cache.

    I managed to trace the problem down to something weird happening in Shield and chanced across https://www.drupal.org/project/shield/issues/3277210 ๐Ÿ› Shield middleware invokes hooks before modules are loaded, corrupting module_implements cache Needs work . Although the patch in that issue did resolve the cache corruption, we ultimately ended up going with https://www.drupal.org/files/issues/2023-03-10/3207813-27.patch โ†’ which addresses the root cause of the issue rather than patching Shield.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States AndySipple

    I was directed to this issue from https://www.drupal.org/project/pathauto/issues/3254939 ๐Ÿ’ฌ Route "entity.path_alias.collection" does not exist Fixed

    Symfony\Component\Routing\Exception\RouteNotFoundException: An exception of type 'Symfony\Component\Routing\Exception\RouteNotFoundException' with the message 'Route "entity.path_alias.collection" does not exist.' was noticed in /code/web/core/lib/Drupal/Core/Routing/RouteProvider.php on line 206
    in Drupal\Core\Routing\RouteProvider::getRouteByName called at /code/web/core/lib/Drupal/Core/Routing/UrlGenerator.php (line 432)
    in Drupal\Core\Routing\UrlGenerator::getRoute called at /code/web/core/lib/Drupal/Core/Routing/UrlGenerator.php (line 129)

    We've been grappling with this error for some time now. It would trigger the White Screen of Death (WSOD) after deploying changes to the Pantheon live site using Redis or when clearing the cache. Although it seems random, this error occurs most consistently in our high-traffic live site.

    We applied the patch mentioned in #27 ๐Ÿ› ModuleHandler skips all hook implementations when invoked before the module files have been loaded Needs work .

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States Fernando Iglesias

    Fernando Iglesias โ†’ made their first commit to this issueโ€™s fork.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States jeffreysmattson

    After three weeks of banging our heads against the wall. With seemingly random site crashes that occurred only in production. With errors like this:

    notice Uncaught PHP Exception Drupal\Core\Field\FieldException: "Attempted to create an instance 
    of field with name field_last_password_reset on entity type user when the field storage does not exist." 
    at {redacted}docroot/core/modules/field/src/Entity/FieldConfig.php line 315 

    These errors were not only for the field_last_password_reset. They would error on all kinds of different fields that didn't have field storage, but actually did have field storage.

    The site would become unavailable and a simple cache clear would resolve it... until a few hours later it would go down again.

    This fixed it. Thank you!

    I used #15 because we are still on Drupal 9.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada joseph.olstad

    re: #41,
    I haven't seen "field storage doesn't exist" in relation to password_policy. I've seen other issues such hook password policy updates that break on anything other than mysql.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada joseph.olstad

    Still needing steps to reproduce. So far I see mention of the passwird_policy module in relation to this issue.

    It would be nice to simplify the steps to reproduce and zero in on the critical path here.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    @joseph.olstad password policy is just an example in #41, not the cause of the issue or the only symptom.

    This bug gets triggered (usually) when a middleware (which runs before modules are loaded) ends up invoking hooks, this is pretty well documented as the issue and there are lots of code paths that can trigger it. I still think ๐Ÿ“Œ Only load all modules when a hook gets invoked Needs review is closer to the correct solution here - i.e. always load modules when hooks are invoked and don't load them until they are.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada joseph.olstad

    ok thanks for that, ya the other approach to this issue at
    ๐Ÿ“Œ Only load all modules when a hook gets invoked Needs review
    does sound better!

  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland Berdir Switzerland

    > I still think #1905334: Only load all modules when a hook gets invoked is closer to the correct solution here - i.e. always load modules when hooks are invoked and don't load them until they are.

    The concern I have with that issue is that it becomes very unpredictable. Yes, triggering a hook in a middleware will work, but it will also cause a considerable performance regression for sites that happen to use such a module, without any indication about that unless they do some profiling.

    The current cases likely only trigger such hooks in edge cases or they would consistently fail and not work at all. But with this, you could for example load an entity in a middleware. Middlewares should be as slim as possible with as few dependencies as possible otherwise you're better off doing a request subscriber.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    But with this, you could for example load an entity in a middleware.

    I think that might be what's happening with some of these reports - loading an entity, or especially config entity, that's usually cached, then suddenly it's not cached and the middleware loads it, triggering the bug.

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands casey

    I think I agree with Berdir that the right solution is to trigger an exception.

  • last update 8 months ago
    30,426 pass
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia rakshith.thotada

    I see if this is cache issue during middleware service request -

    • I am trying to throw an exception in that case and in exception handler -> Get the active definition which has the last successful event response.
    • Further if still it does not respond - throw the usual InvalidPluginDefinitionException
  • Status changed to Needs review 7 months ago
  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia rakshith.thotada

    Adding patch for 10.1.x version.

  • last update 7 months ago
    Build Successful
  • Status changed to Needs work 7 months ago
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States smustgrave

    Was previously tagged for issue summary which still needs to happen.

    Current development branch is 11.x

    If you are going to upload patches please include interdiffs

  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia nitesh624 Ranchi, India

    Is there any patch available for drupal 10.0.x release? We are also running into same problem

  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia nitesh624 Ranchi, India

    I tried patch from #27 but did not resolve issue

  • I'm using patch #27 on Drupal 10.1. At this point, I don't remember why I'm using it, but everything seems to work.

  • ๐Ÿ‡ฎ๐Ÿ‡ณIndia nitesh624 Ranchi, India

    For me this failing during blt sync command. Which actually pull down the production database to local. and it runs couple of drush command
    drush sql-sync-> this ran fine without any issue
    drush cr -> Here the error came "The "paragraph" entity type did not specify a translation handler."

    We have the multilingual enable on site and paragraph field also available on node, which allowed to be translated via content_translation module

  • Open in Jenkins โ†’ Open on Drupal.org โ†’
    Environment: PHP 8.1 & MariaDB 10.3.22
    last update 5 months ago
    29,722 pass
  • Open in Jenkins โ†’ Open on Drupal.org โ†’
    Environment: PHP 8.1 & MySQL 5.7 updated deps
    last update 5 months ago
    Custom Commands Failed
  • ๐Ÿ‡บ๐Ÿ‡ธUnited States bobthebuilder

    Drupal 10.2.3. Paragraphs 8.x-1.17. Paragraphs Library 8.x-1.17.
    I'm still receiving the error message when trying to create a Paragraph. Patch #27 does not fix the issue and Patch #50 can no longer be applied with Drupal 10.2.3. Patch #50 did work for me pre-Drupal 10.2.3.

  • This issue needs a MR to be created instead of using patch files.

  • ๐Ÿ‡จ๐Ÿ‡ฆCanada joseph.olstad

    As discussed in #44 through #47, prioritize effort to 1905334 instead.
    ๐Ÿ“Œ Only load all modules when a hook gets invoked Needs review

  • Merge request !6976Add changes from patch 27. โ†’ (Open) created by solideogloria
  • I created a MR from #27, since that's what most people said is working. #50 and #51 seems to do something entirely different, and it also doesn't have any tests.

  • Pipeline finished with Success
    4 months ago
    Total: 486s
    #114891
  • ๐Ÿ‡ซ๐Ÿ‡ทFrance quimic Paris

    Patch #27 can no longer be applied on Drupal 10.2.4

  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands jsst

    For us too, patch #15 brought an end to a long running terror bug, causing spontaneous crashes due to this issue. Our site was running fine for five years, but somewhere during the upgrade to 10.2 the random cache corruption started to hit. Our http middlewares don't do anything special, something in Drupal Core made this bug worse. The bug is hard to debug because it seems to depend on concurrent usage of the site and clearing the cache can fix the bug (after running it up to 10 times) but also cause it.

    So we have a major bug that is hard to debug and causes a site to go completely offline, with an unreliable fix (drush cr until it works). Seems to me we should be merging #15 in Drupal 10.2 as soon as possible.

  • Well, we shouldn't be merging a patch at all. We should merge the merge request.

  • Status changed to Needs review about 2 months ago
  • I added a steps to reproduce in the issue summary that I copied from a comment. The comment said that the patches here fixed the issue, so it's at least one example.

    However, there may be more ways to reproduce the problem, and the example I added requires another module. It'd be helpful if anyone can reliably reproduce the problem in another way.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States nicxvan

    Should we be using the patch in 27 as the base? I think it seems the suggestion was to throw an exception if invoking hooks in middleware?

  • An exception is already thrown and is the cause of the issue in the first place. How would throwing an exception fix the issue? Wouldn't it just break in a different place?

    The patch the MR is based on fixes the issue because the information needed is already there, and it is helpful for anyone patching their code, like me.

    If you want to change the approach, you'll need to open a new MR.

  • ๐Ÿ‡บ๐Ÿ‡ธUnited States nicxvan

    I'm not sure to be honest, I was reading comments 33+ which came after the patch in 27 saying that the solution in 27 could have unpredictable performance issues and the exception description.

    I know the later patches don't have tests, but I think the concerns around comment 33 still need to be addressed.

  • Status changed to Needs work 5 days ago
  • The Needs Review Queue Bot โ†’ tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide โ†’ to find step-by-step guides for working with issues.

Production build 0.69.0 2024