Hux-style hooks, proof of concept

Created on 22 June 2023, about 1 year ago
Updated 28 April 2024, about 2 months ago

Problem/Motivation

In 🌱 [META] Hooks via attributes on service methods (hux style) Active we proposed to provide a new way to implement hooks, using object methods with attributes.
A similar system already exists in contrib: https://drupal.org/project/hux

In this issue we will do proof-of-concept implementation and experiments.

Once this is done, we can split out real implementation issues that can then be developed and merged one by one.
E.g. perhaps there will be separate refactoring issues as preparation for the new system.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

📌 Task
Status

Postponed

Version

11.0 🔥

Component
Base 

Last updated 1 minute ago

Created by

🇩🇪Germany donquixote

Live updates comments and jobs are added and updated live.
  • Needs change record

    A change record needs to be drafted before an issue is committed. Note: Change records used to be called change notifications.

Sign in to follow issues

Comments & Activities

  • Issue created by @donquixote
  • last update about 1 year ago
    Custom Commands Failed
  • @donquixote opened merge request.
  • last update about 1 year ago
    Custom Commands Failed
  • last update about 1 year ago
    Custom Commands Failed
  • 🇩🇪Germany donquixote

    There are tests that hard depend on protected methods that I am going to remove...

    class ModuleHandlerTest extends UnitTestCase {
      [..]
      public function testHasImplementations() {
        $module_handler = $this->getMockBuilder(ModuleHandler::class)
          ->setConstructorArgs([$this->root, [], $this->cacheBackend])
          ->onlyMethods(['buildImplementationInfo'])
          ->getMock();
    

    It feels wrong that we are testing this.

  • last update about 1 year ago
    Custom Commands Failed
  • last update about 1 year ago
    Custom Commands Failed
  • last update about 1 year ago
    Custom Commands Failed
  • last update about 1 year ago
    Build Successful
  • last update about 1 year ago
    29,551 pass
  • last update about 1 year ago
    29,217 pass, 170 fail
  • last update about 1 year ago
    Custom Commands Failed
  • 🇩🇪Germany donquixote

    I have to say the Drupal\Tests\Core\Extension\ModuleHandlerTest is quite incomplete.
    If that test passes, but other tests fail due to problems in the ModuleHandler, it means that ModuleHandler is not fully covered by the unit test.

  • 5:35
    3:21
    Running
  • 57:13
    53:37
    Running
  • 🇩🇪Germany donquixote

    I wonder, is the signature of ModuleHandler::__construct() part of the API, in the sense that changing it is considered a BC break?
    If so, we would have to construct the HookMap object inside the constructor.
    (Also we might further split up or rename HookMap, it is just an intermediate state)

  • Status changed to Needs review about 1 year ago
  • 🇬🇧United Kingdom catch

    I wonder, is the signature of ModuleHandler::__construct() part of the API, in the sense that changing it is considered a BC break?

    No it's not. However we often do 'best effort' bc for constructor arguments when they're added or removed, since that is generally easy to do and makes things easier for contrib. i.e. adding the argument with a NULL default and then using Drupal::service() to set the property if it's not injected with an E_USER_DEPRECATED (but no test coverage for the bc case). Several examples of this in 10.1 and probably loads in 9.5.x

  • 🇩🇪Germany donquixote

    Lessons learned from 📌 Improve unit test coverage for ModuleHandler RTBC .
    The behavior of hook_module_implements_alter() for ->alter() with multiple hooks is quite interesting.
    E.g. for ->alter(['entity_view', 'node_view']).

    Almost all changes to implementations of hook_node_view() are ignored:
    - Removed implementations are re-added, if the function exists.
    - New implementations for fake modules are removed again, if the module does not exist in the module list.
    - Changed order is reverted.

    The only thing that is respected is if a new implementation is added in a custom "group" (e.g. mymodule.xyz.inc).

    I wonder if we should preserve this exact behavior.

  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    29,566 pass
  • 🇩🇪Germany donquixote

    I am including the tests from 📌 Improve unit test coverage for ModuleHandler RTBC .

    Some of them actually change with the new system, showing how the behavior changes.

    E.g., if you use hook_module_implements_alter() to remove an implementation from a hook_form_FORM_ID_alter(), then:
    - In the current version of Drupal, the implementation would be re-added if the same module also implements the base hook hook_form_alter().
    - In the new version, the implementation will remain removed.

    We can attempt to keep the old behavior, or we can consider this a "bug fix".
    Depends whether we consider this a BC break, or if we only support "basic" (= non-creative) usage of the hooks.

  • 🇬🇧United Kingdom catch

    I think it's OK to change the behaviour to what we think should actually happen - the results could already change if a module added a hook implementation or changed from a generic to more specific hook etc.

  • 🇺🇸United States smustgrave

    Just adding some tags.

    Feature sounds interesting though!

  • 🇩🇪Germany donquixote

    Interesting question:

    Currently we discover every hook when it is needed.
    This means some hooks are only discovered a long time after the cache rebuild, some never.
    This is because without a call to ModuleHandler->invokeAll(), we don't know for sure which part of a function is the hook name, and because get_all_functions() is expensive to work with.

    With attributes, the natural way is to discover all of the hooks at once.

    However, at some point we need to merge the procedural hooks with the method attribute hook implementations.

    There are two ways to do this:

    1. Keep a cache of the method attribute hook implementations. Merge with the procedural implementations on demand.
    2. Whenever we discover attribute method hooks, we already discover all procedural implementations for any hook that has attribute method implementations. Only for those hooks that only have procedural implementations, we do the discovery later.

    The latter would make things a bit easier, but it can also have some funny side effects, if:

    • Modules rely on custom includes before they invoke a hook. We know for a fact that this happens for ->invoke(), but does it happen for ->invokeAll()?
    • Some include files have funny side effects when included, or they contain broken code.
  • 🇬🇧United Kingdom catch

    Only for those hooks that only have procedural implementations, we do the discovery later.

    Not entirely sure on the difference between options #1 and #2. However in general, I think we would want to deprecate this as soon as possible - i.e. once we have attribute discovery for procedural hooks, we can deprecate procedural hooks without attributes, and trigger a deprecation notice when they're found. Then the upgrade path for modules is to add the attribute with no other code changes. This is the sort of thing we could do in a single major release cycle and doesn't imply deprecating procedural hooks in general (which we might not even need to do if there's no extra discovery cost), then we can drop the get_all_functions() in hopefully Drupal 11.

    One thought with the attribute hook discovery. Agreed it makes sense to do all at once, but another reason for the 'on-demand' hook cache building is that a lot of hook implementations are never invoked - i.e. say my module alters 15 forms for 15 different contrib modules, none of which are installed, we can discover those hooks, but they're dead code. Similar with something like hook_entity_presave() for config entities which are never updated on production outside deployments just before a cache clear. It's potentially a very big array to load on every page.

    So... we might want a two-tier cache, one with everything that's been discovered, and a runtime one using cache collector that holds the ones that have been invoked. We could even put the bigger one in cache.data and the smaller one in cache.bootstrap

  • 🇩🇪Germany donquixote

    once we have attribute discovery for procedural hooks

    I don't know if that is realistic.

    With classes and methods, we can rely on:

    • Services tagged with 'hook' (this is how hux does it).
    • (optional) PSR-4 directory scans to find classes that were not registered as services yet.
    • Reflection to get all methods in a class.
    • Class autoloading, which means we don't have to worry about whether a file is included or not.

    With procedural code, we have two options:

    If we want procedural discovery based on attributes, we need to:

    • Include all the files where we want to look for functions with attributes. We could take into account hook_hook_info() for that. We could also try to include the files for the additional groups added by hook_module_implements_alter(), which would make this somewhat circular.
      But actually we don't really need to, because once people put attributes on their functions, they can also put a require_once to make sure that the respective files are always included.
    • Use get_defined_functions(), and look for all functions with hook attributes.
      This means for every function we find we need to create new \ReflectionFunction($function), and call ->getAttributes(). This would also gives the opportunity to check ->getFileName(). But we don't need this step, if we expect all the files to already be included, and omit any implementations at call time that are not available.

    We could make a dedicated attribute for function hooks, where the module and function are implicit.
    This would expect that the function name is like $module . '_' . $hook, and the $module is the same as the first part of the file name.
    E.g. if the file is content_moderation.module or content_moderation.workflow.inc, then the $module is 'content_moderation' and the $hook is the remaining part of the function. If the function does not start with the module name, the use of that attribute is illegal.

    One thought with the attribute hook discovery. Agreed it makes sense to do all at once, but another reason for the 'on-demand' hook cache building is that a lot of hook implementations are never invoked - i.e. say my module alters 15 forms for 15 different contrib modules, none of which are installed, we can discover those hooks, but they're dead code. Similar with something like hook_entity_presave() for config entities which are never updated on production outside deployments just before a cache clear. It's potentially a very big array to load on every page.

    I imagine the size of the cache might be similar to that of the service container.
    With simple grep, I find 445 services, 1426 hook implementations, and 279 documented hooks (which include placeholder hooks) in core.
    (my version of core where I did this is slightly modified, but the order of magnitude should be ok)

    In a cache, with the new system, for each implementation we would store either a function name, or a service id + method name.
    And for each list of implementations for a specific hook, we need space for the hook name and some overhead.
    So each entry will likely be smaller than a service record.

    For the service container, we currently have an interesting optimization where we serialize each entry separately.
    But we still load all the serialized items into memory.

    So... we might want a two-tier cache, one with everything that's been discovered, and a runtime one using cache collector that holds the ones that have been invoked. We could even put the bigger one in cache.data and the smaller one in cache.bootstrap

    I like the idea.
    Perhaps we could even measure the frequency by which a hook is used?

  • 🇬🇧United Kingdom catch

    📌 Deprecate hook_hook_info() Active is open to deprecate hook_hook_info(), I think we could probably limit attribute scanning for hooks to .module or files that are required directly from there.

  • 🇩🇪Germany donquixote

    we could probably limit attribute scanning for hooks to .module or files that are required directly from there.

    The problem is there is no php function to get all functions from a specific file.
    So we can check ReflectionFunction->getFileName() to exclude some functions, but we still have to call get_defined_functions() and get a bunch of unrelated ones.

  • 🇩🇪Germany donquixote

    Another observation:

    ModuleHandler has methods like ->addModule(), and ->setModuleList(), which are called when new modules are installed.
    Currently, if ->invokeAll() is called after new modules are added in this way, the new implementations for the new module will be added.
    This is problematic, because at that point all the services are outdated.
    But, in my experiments, this never happened.

    Shortly after that, when modules are enabled, the container is replaced, and a new ModuleHandler is created for the new container.
    The old module handler becomes useless at this point.

    I wonder if any objects still reference and use the old module handler, or any objects from the old container, after a new container has been created as part of a cache rebuild.

  • 🇬🇧United Kingdom catch

    The problem is there is no php function to get all functions from a specific file.
    So we can check ReflectionFunction->getFileName() to exclude some functions, but we still have to call get_defined_functions() and get a bunch of unrelated ones.

    Ah sorry I get it now, so actually attributes on procedural functions is not really a step forwards from where we are now, we're just exchanging name matching with reflection but get_defined_functions() is still there. I now understand the difference with

    Keep a cache of the method attribute hook implementations. Merge with the procedural implementations on demand.

    - just treat the procedural implementations the same as they are now, and that feels like a simpler option then overall.

  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • Open on Drupal.org →
    Environment: PHP 8.2 & MySQL 8
    last update 12 months ago
    Not currently mergeable.
  • @donquixote opened merge request.
  • last update 12 months ago
    29,833 pass, 1 fail
  • Open on Drupal.org →
    Environment: PHP 8.2 & MySQL 8
    last update 12 months ago
    Not currently mergeable.
  • last update 12 months ago
    29,833 pass, 1 fail
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    29,834 pass
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Build Successful
  • 🇩🇪Germany donquixote

    I introduced some interesting tools for the tests.
    We have to decide if we keep (and improve) them, or if we find something equivalent in the outside world.

  • last update 12 months ago
    Build Successful
  • last update 12 months ago
    Build Successful
  • last update 12 months ago
    29,453 pass, 116 fail
  • 🇩🇪Germany donquixote

    Lesson learned: hook_theme() is special, because Drupal\Core\Theme\Registry::build() calls ->processExtension(), which invokes the function directly, instead of using a callback.
    This will need a separate issue to clean up.

  • last update 12 months ago
    29,455 pass, 114 fail
  • last update 12 months ago
    29,706 pass, 77 fail
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    Custom Commands Failed
  • last update 12 months ago
    29,759 pass, 4 fail
  • last update 12 months ago
    29,759 pass, 4 fail
  • 🇩🇪Germany donquixote

    Observation:
    CKEditor5PluginManagerTest only passes because of a bug.

    class CKEditor5PluginManagerTest extends KernelTestBase {
    [...]
      private function mockModuleInVfs(string $module_name, string $yaml, array $additional_files = []): ContainerInterface {
    [...]
        $container = new ContainerBuilder(new FrozenParameterBag([
          'app.root' => $root,
          'container.modules' => [
            $module_name => [
              'type' => 'module',
              'pathname' => "modules/$module_name/$module_name.info.yml",
              'filename' => NULL,
            ] + $this->container->getParameter('container.modules'),
          ],
          'container.namespaces' => [
            "Drupal\\$module_name" => vfsStream::url("root/$site_directory/modules/$module_name/src"),
          ] + $this->container->getParameter('container.namespaces'),
        ] + $this->container->getParameterBag()->all()));
    

    This leads to an array where the other modules are nested inside the info of the module being tested:

    {
        "ckeditor5_invalid_plugin": {
            "type": "module",
            "pathname": "modules\/ckeditor5_invalid_plugin\/ckeditor5_invalid_plugin.info.yml",
            "filename": null,
            "mysql": { ... },
            "system": { .... },
            "user": { ...},
            "filter": { ...},
            [...]
        }
    }
    

    If we fix this, we run into the assertion error from Extension::__construct() which calls file_exists() on the files.
    At the time this happens, $root is a vfs:// path, and this virtual directory only contains the custom module being tested, but not all the other modules in the list.

    This bug raises its head when I convert the module_handler service to autowire, due to the order in which compiler passes run.
    It was quite the adventure.

  • Open on Drupal.org →
    Environment: PHP 8.2 & MySQL 8
    last update 12 months ago
    Not currently mergeable.
  • last update 12 months ago
    29,832 pass, 1 fail
  • 7:14
    3:38
    Running
  • 7:06
    3:30
    Running
  • 0:14
    59:23
    Running
  • 0:12
    59:23
    Running
  • 🇩🇪Germany donquixote

    Another observation.
    Drupal\Tests\system\Kernel\Extension\ModuleHandlerTest was doing illegal stuff..

        // Test the fixed list feature.
        $fixed_list = [
          'system' => 'core/modules/system/system.module',
          'menu' => 'core/modules/menu/menu.module',
        ];
        $this->moduleHandler()->setModuleList($fixed_list);
    

    This is not right, because the signature is

      /**
       * Sets an explicit list of currently active modules.
       *
       * @param \Drupal\Core\Extension\Extension[] $module_list
       *   An associative array whose keys are the names of the modules and whose
       *   values are Extension objects.
       */
      public function setModuleList(array $module_list = []);
    

    So it expects Extension objects, not strings.
    The only reason this worked is because the type of these items is never checked.

    This also reveals another issue:
    'menu' module does not exist anymore, only 'menu_ui'.
    See https://git.drupalcode.org/issue/drupal-3368812/-/commit/cc469d7470582e1...

  • 🇩🇪Germany donquixote

    Another note.
    I don't like that Drupal\Core\Extension\Extension object knows the Drupal root, can "load" itself (= include the .module file), and does the file_exists() in the constructor. I would like it much better as a "dumb" value object, that does not do anything by itself.

    Now we have two classes responsible for loading files: The ModuleHandler (to load additional include files) and the Extension object for the main *.module file. To alter the way this loading works, we would have to modify two places, at least.

    E.g. for the tests with vfs:// or for some tricks with "virtual" modules, it would be nice to use a different root for some modules..

  • 🇩🇪Germany donquixote

    Review time!

    Keep in mind that this is a POC branch, and we don't need to merge everything at once.
    Instead, we can create spin-off issues that apply these changes piecemeal, with room for discussion on each one.

  • 🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia
  • last update 12 months ago
    29,842 pass
  • 🇩🇪Germany donquixote

    In the MR I introduced some events and some cache tags, that replace direct calls, so we no longer need to pollute interfaces with reset methods.
    However, I am not really happy with the current shape of this.

    Scenarios to support:

    • A cache rebuild.
      When this happens, we want to clear or invalidate all persistent caches related to hooks.
      We do not necessarily need to reset object property caches, because these objects are thrown away when we get the new container.
    • ModuleHandler->addModule(), ->addProfile(), ->setModuleList().
      (Ideally I would like to get rid of these completely, but that's probably a BC break we don't want in 10.x)
      When this happens, we want to:
      • Clear or invalidate all persistent caches related to hooks.
      • Reset all object property caches related to hooks.
      • Allow other subscribers to reset caches that are dependent on the module list, but that have nothing to do with hooks. One example could be the theme registry with the preprocess functions.
    • ModuleHandler->resetImplementations().
      This is a leftover, and should not be called directly anymore, because we should use events and cache tags instead.
      However, it still needs to be supported for BC (I think).
      When it happens, we want to:
      • Clear or invalidate all persistent caches related to hooks.
      • Reset all object property caches related to hooks.

      So this is more narrow than the scenario above.

    • Request terminate event.
      When this happens, we want to write newly discovered hook implementations to the cache.

    I think I actually drop the cache tags idea and go completely with events.

  • 🇩🇪Germany donquixote

    Maybe I should explain a bit the architecture in the MR as it is now.

    there are 3 different classes/interfaces that each contain a list of implementations, but in a different way, serving a different role in the system.

    ImplementationListBuilder:
    List of implementations used during discovery.
    The implementations are in structured array, with meta information for weights and before/after.
    The class has methods to add implementations, to merge other lists, to apply a hook_module_implements_alter() etc.

    CompactImplementationListInterface
    Fixed list of implementations that can be serialized and cached.
    The implementations are already in the order in which they should be executed, but there is no meta information for weights or before/after.
    Callbacks are not stored as closures, but as strings and arrays.
    The interface has methods to create actual callback lists. These methods require a container and a module loader.
    So, on unserialize, dependencies are not pulled from Drupal::container(), instead, the container and module loader are explicitly passed to the method.
    When a callback list is requested, this object will include all the include files, get all the services from the container, create closures for all implementations, and discard implementations where the service or the function or method does not exist.

    HookImplementationCallbackListInterface
    Fixed list of implementations as closures, meant for actually invoking the hook.
    It also has a module name per callback, that will be used in ->invokeAllWith().
    These lists can _not_ be serialized, because of the closures.
    The interface has methods to get the callbacks or to invoke all. We might want to remove some of these methods which are currently not used.

    SingleModuleCallbackListInterface:
    Similar to the above, just only with the closures for a single module.
    This one is a bit tricky, because it becomes obsolete when a new file is included that contains a $module . '_' . $hook function that was not known at discovery time.

    Special implementations for "empty list"
    There are some specialized implementations of the above interfaces for "empty list" or "single implementation".
    The only purpose is performance: If we know the list is empty, we can skip some of the checks we would otherwise do.
    Perhaps this is overkill, and we will remove it.

  • last update 11 months ago
    29,820 pass, 1 fail
  • last update 11 months ago
    Build Successful
  • last update 11 months ago
    29,850 pass, 1 fail
  • last update 11 months ago
    29,850 pass, 1 fail
  • 🇺🇸United States smustgrave

    For what it's worth I think this would be very neat to get in to 10.2 or maybe it's an actual addition for D11 to not be included in 10?

  • 🇩🇪Germany donquixote

    Yes, I want this in 10.x :)

    We need to evaluate the BC breaks once we agree on the general direction, and then we can see how far we can mitigate them.

    With this and 📌 Replace ContainerAwareEventDispatcher with Symfony EventDispatcher Fixed , we will have true competition of events vs hooks, and we no longer need to reject one of them for the wrong reasons. It would be really useful to have this before D11, so that people can upgrade their modules, and we can learn some lessons before D11.

  • 🇩🇪Germany donquixote

    Kernel tests for hook system, as preparation:
    📌 Add kernel tests for hook system Needs work

  • Status changed to Needs work 10 months ago
  • The Needs Review Queue Bot tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • 🇺🇸United States EclipseGc

    I added relevant commentary here: https://www.drupal.org/project/drupal/issues/3366083#comment-15549894 🌱 [META] Hooks via attributes on service methods (hux style) Active

  • Status changed to Postponed about 2 months ago
  • 🇬🇧United Kingdom catch

    Postponing this on 📌 OOP hooks using event dispatcher Needs review which has a similar API but much simpler bc layer.

Production build 0.69.0 2024