Account created on 18 March 2009, over 16 years ago
#

Merge Requests

More

Recent comments

🇩🇪Germany donquixote

That keyvalue solution would not be needed if we could somehow store the container in a way that allows partial loading.
So, on bootstrap we would load a small chunk of the container definition, and then we lazily load more chunks as needed.

I think we are still far away from that, and certainly won't solve it in this issue, but it could be a reason to wait with that keyvalue solution.

And another thing:
Currently with that keyvalue + cache solution, we still load the complete raw lists for all hooks, as soon as one single hook is invoked.
If we are concerned about memory, we could consider separate caches.

I need to do some profiling myself, when I have time.

Just to clarify: The main reason for keyvalue is to reduce the size of the container definition?
Or are there other benefits?

🇩🇪Germany donquixote

Another thing to consider is whether we should split some of this logic into a separate service, something like HookHandler, or HookImplementationStore, to reduce complexity and stabilize the constructor signature in ModuleHandler.

This idea is attractive at first, but then leads to the next question, should the methods like ->invokeAll() stay on ModuleHandler, or should they all go into that new class, or perhaps in yet another new class, so that we would have ModuleHandler, HookHandler, and HookImplementationStore?

Having a separate service only to load ImplementationList objects can be really nice for decorating and mocking.
This service should therefor not have the public methods like ->invokeAll() and ->alter().

With that in mind, I would absolutely keep these public methods in ModuleHandler for now.

🇩🇪Germany donquixote

Hello, just a quick summary of how I understand the new direction proposed by @berdir:

Same as upstream:

  • We register a service per class with hook implementations.
    These services remain public (which is needed for CallableResolver/ClassResolver).

Same as earlier MRs:

  • We introduce ImplementationList as proposed here and in the other issue.
  • We drop ProceduralCall, as already proposed here.
  • We no longer use EventDispatcher for hooks.

New:

  • We use CallableResolver to resolve implementation identifiers to services and methods.
    This way we avoid having to inject a bare container in ModuleHandler, and we avoid the service closures.
  • We use keyvalue to store implementation lists and metadata which would otherwise be stored in the container.
    This way we avoid storing any hook implementation lists or hook system metadata in the container.

In fact we could deliver this as 3 separate changes:

  1. Introduce ImplementationList in 📌 [pp-1] Explore adding an ImplementationList object for hook implmentations Active
  2. Drop EventDispatcher for hooks, pass CallableResolver and a raw `$hook_lists` parameter to ModuleHandler::__construct() (and the include files and order ops).
  3. Use keyvalue store instead of container parameters for hook lists and metadata.

I could see each of these steps as an incremental improvement.
And it would allow us to revert the last step, without reverting earlier steps.

It would be interesting to see the container size in that second step.
Would these raw hook lists with `[$hook => ["$class::$method" => $module]]` be smaller than what we currently write to the container for event dispatcher, or what we would add with service closures?

That last step with keyvalue seems a bit spooky to me, although it should work in regular scenarios.
What concerns me is I am not sure if the life cycle of a container is always aligned with what we store in the database:

  • We can have two ongoing requests, where one is still operating with an old container, and another is already on a new container.
    This could be due to high traffic or due to a long-running operation in that first request.
    It would be mitigated if people use maintenance mode.
  • Within the same request or process, we could have an operation that is using services from the old container, which triggers a container rebuild in a subroutine, and then continues its own operation with services from the old container.
    I am not sure if this ever happens or where.

This is not new, and I think there are other things that would already cause the same kind of problem.
E.g. an operation uses services from an outdated container, while entity tables in the database have their schema based on a new container.
So it might not be an issue we should worry about.
My concerns could be unfounded, and it would be quite hard to craft a realistic scenario.

🇩🇪Germany donquixote

We could detect if modules have a hook_install(), and if so, put them in separate groups.

I found that:

  • Just checking for hook_install() might not be enough.
    There is also hook_module_preinstall() and hook_modules_installed().
    And then there could be funny side effects if e.g. plugin derivers depend on configuration.
  • With the current implementation, just "put them in separate groups" does not really help if a configuration is created in hook_install().
    Because $config_installer->checkConfigurationToInstall('module', $module); is called for all modules in the initial list, not only for those in the group.

Therefore what I proposed cannot be a sufficient solution.
My original idea was to auto-detect which module needs to be in a separate group. But that might just not be viable, we might have to stick with that "container_rebuild_required", and tweak the order of operations in ModuleInstaller.

🇩🇪Germany donquixote

Based on some experiments, the following alternative solution would also do the job, and allow installation in one batch:
(I write this in the context of 🐛 hook_install() may not work as intended, if modules installed in bulk Active )

  • In Drupal core, in ModuleInstaller::doInstall(), call \Drupal::getContainer()->get('plugin.cache_clearer')->clearCachedDefinitions() _before_ invoking hook_module_preinstall().
    (we could also call this directly from our own contact_storage_module_preinstall())
  • In this module, in contact_storage_module_preinstall(), if the original 'contact_message' entity type is not found, simply return and do nothing. I think it will be fine, because contact_storage_entity_type_alter() will be invoked before the entity type is created.

I should create an alternative MR to see if this works.

🇩🇪Germany donquixote

The "container_rebuild_required: true" does solve the problems in our dependency chain.
(which is likely the same that @andras_szilagyi is working with)

For the other changes, as mentioned by @rodrigoaguilera, perhaps they don't need to be part of this MR.

I think the entity.definition_update_manager does not know about the contact_message entity type because of this.

Yes.

There are actually two problems:

  • The entity types from other modules in the list are not yet present, when hook_module_preinstall() fires.
    This causes the problem with contact_message entity type missing.
    This problem goes away if we install 'contact' before as a separate step.
  • The plugin definitions from other modules in the list are not yet present, when hook_module_preinstall() fires.
    (because plugin cache is cleared only after that hook, in ModuleInstaller)
    This becomes a problem if other modules in the list add additional fields to 'contact_message' via hook_entity_base_field_info(), which depend on plugins that are provided in one of the modules to be installed.
    This contact_storage_module_preinstall() calls EntityDefinitionUpdateManagerInterface->installEntityType(), which somewhere down gets the schema for 'contact_message', which needs the plugin for each field.
    This could be avoided if core ModuleInstaller would clear the plugin cache before invoking hook_module_preinstall().
🇩🇪Germany donquixote

I noticed that "container_rebuild_required" does not really help with config dependencies, because the following code happens before modules are grouped based on 'container_rebuild_required':

    // Check the validity of the default configuration. This will throw
    // exceptions if the configuration is not valid.
    $config_installer->checkConfigurationToInstall('module', $module_list);
🇩🇪Germany donquixote

@berdir

I agree with #14.

Are you sure this is the comment number you want to reference? That comment does not look like something to agree or disagree with..
(I will delete this comment text after it has been clarified)

🇩🇪Germany donquixote

I'm talking about the HookImplementation value objects. They are just data, not services. They're only in the container because it's convenient, as the container is the one thing that's readily available to HookCollectorPass. I did propose a specific alternative to having this in the container, and I'll try to come up with a proof of concept for that when I have some time and energy.

This reminds me of how hux module works / used to work.
One rebuild-time discovery mechanism to register the hook classes.
Another discovery mechanism to find the actual implementations, which happens at runtime and is cached.
This works as long as the hook services remain public, and we just blindly register all classes in a given directory.

One nice thing of having it all in the container is to have a single source of truth. While the container gets rebuilt, we have the old container which contains the old hook services and the old hook implementation lists. The new container contains the new services and the new lists. We never have to wonder whether our cached lists are up to date with the container.

What I'd like to experiment with is what I mentioned in my previous comment, using keyvalue and a cache. Which is similar to state, but state is fairly frequently invalidated and needs to be rebuilt, this data should never change except when doing a container rebuild.

Conceptually, all of this is cache, not state. So keyvalue seems wrong. But maybe I am missing part of the picture.
Also, if we do anything like this, we should maintain different versions of this cache per "container version", to avoid crazy effects on concurrent requests during cache rebuild.

with MR !11980, the smallest are 247, so only twice as large as the closure. I understand that's because of not using ProceduralCall, so we don't need to serialize those objects. I wonder if we could build on that. What if we just store the raw data as parameter, and then call HookImplementationsList::load() ourself? Instead of the indirection through a service closure? So we'd go back to a parameter, not services.

At some point we need to lazily obtain the service instances.
It has to be lazy, because in a given request we only want to instantiate the services we need.
So:

  • Something in this mechanism needs a container injected and call $container->get(). This can either be in ModuleHandler or in a new dedicated services that provides implementation lists.
    Arguments have been made that injecting a container is not desirable..
  • The services for hook classes must remain public. Otherwise, symfony will "forget" all these services because they appear unused, and also code outside the container cannot obtain them via $container->get().
🇩🇪Germany donquixote

Still catching up on the discussion, so not sure which MR is the one to test. Based on a quick test, MR !12311 seems to increase the container size, not decrease it. On umami, from 762749 to 830427. For comparion, the container size on my minimal distribution size is 1.3MB, so way larger still and it will grow too. The other MR is about the same size as before, mostly because the service definitions are smaller, see below.

There was an open discussion whether to keep the ProceduralCall service for procedural implementations.
Technically this is dead weight, but somebody might use ProceduralCall to target a procedural implementation for ordering.
We kept both PRs up.

With event-based hooks, ProceduralCall was needed to register procedural implementations in EventDispatcher, afaik.

🇩🇪Germany donquixote

@berdir
One design consideration here was private services vs public services.

E.g. with plugins: Currently the container does not know about every single plugin instance. But this only works because services are public, and we make the container available to the plugin create method.

By each hook class being a service, and making them available to ModuleHandler through the container, the hook system will not be an obstacle for making services private.
This is also currently thie case, where the container adds stuff to EventDispatcher.
Also, in a previous version of this, I wanted to pass the container to ModuleHandler, but there were arguments against this.

If we are concerned about container size, I see two directions that could be explored:

  • Keep all the services, but find a way to not have to unserialize the full container on every request.
    Wasn't there some way to compile the container to php?
  • Convert a lot of things (e.g. hook classes) to not be 1st class services, and be more like plugins, where we provide the container on instantiation. This requires public services.

We could also go somewhere in between: Let every hook class be a service, but don't add the additional services and closures for ImplementationList objects. Instead, have some kind of factory service that can dynamically create an ImplementionList object for each hook. But this still means that we need to have the full list of implementations in the container.

🇩🇪Germany donquixote

I have some concerns about the container size from what I saw in the theme OOP issue, all the extra objects and services being serialized separtely aren't going to help.

Regarding theme hooks, yes, this would mean more services.

For the module hooks discussed in this issue, we already do have one service registered per class with hooks, and then some overhead in the event dispatcher. definition.

With this issue we don't get any changes to the container, because here the ImplementationList objects are created at runtime.

With the "separate hooks from events", we get one additional service per hook (if it has at least one implementation).
At the same time, we lose the additional overhead for the event dispatcher definition.
For the "container size" as in number and size of definitions, this could go either way.
For number and cost of instantiated services in an average request, I would hope that number to go down significantly.

I'll try to do some tests on a real project with this.

This would be nice to see!

🇩🇪Germany donquixote

The fix is specifically in the widget, the tests are not covering that.
I can do a FunctionalTest for the widget.

🇩🇪Germany donquixote

If we can add a test case as described in the proposed resolution this would be perfect. Should be easy to write a similar test for all widgets.

It makes sense, but:
It seems currently none of the widgets have any test.
Does it make sense to add a test that _only_ tests the midnight != NULL case, if we don't add general tests for the widgets?
Or am I missing something?

🇩🇪Germany donquixote

Since this change, it seems we can no longer set 12:00 AM as a time value in a TimeRangeWidget.

  public function massageFormValues(array $values, array $form, FormStateInterface $form_state) {
    foreach ($values as $delta => $value) {
      if (empty($value['from'])) {
        $values[$delta]['from'] = NULL;
      }
      if (empty($value['to'])) {
        $values[$delta]['to'] = NULL;
      }
    }
    return $values;
  }

This code will replace 0 (12:00 AM) with NULL.
Instead of empty() we should check for empty string.

      if (($value['from'] ?? '') === '') {
        $this->messenger()->addStatus(var_export($value, TRUE));
        $values[$delta]['from'] = NULL;
      }
      if (($value['to'] ?? '') === '') {
        $values[$delta]['to'] = NULL;
      }

Going to open a new issue..

🇩🇪Germany donquixote

Module weights are still relevant

hook_module_implements_alter is now deprecated so we should use the hook ordering parameters.

We have hook ordering attributes, but the order in which these are discovered and applied still depends on module order, and thus module weight. We may introduce new metta ordering mechanisms, but for now we are still in the same situation as with hook_module_implements_alter().

There are also other subsystems (plugin discovery, service discovery, event subscribers) that depend on module order and thus module weight. Event subscribers have their own weight/priority system, but the base order is still module order.

This said: It is not clear to me that configuration is the best place to set these weights.
Instead, there could be a system to declare module weights programmatically and not store explicit weights in config.

Dependency order is brittle

It was suggested here that modules should be ordered based on their dependencies. I don't like it, I suspect it to be brittle/fragile. If a new version of a module adds or drops a dependency, suddenly event subscribers are invoked in a different order.

Unclear issue goal

Also, the issue description proposes to deprecate and remove module_set_weight(), but it does not mention removing the actual weights the in core.extension.yml configuration. But then we find arguments as if we actually plan to remove these stored weights in the configuration.

Proposal: Keep module weights, provide alternative ways to set it

What we could do is o provide a service method to set a module weight, to use instead of the function.
This could be part of ModuleInstaller, and it could have a similar effect as installing or uninstalling a module, with a container rebuild.
However, as long as install and update hooks are procedural, there is no further damage from module_set_weight() being procedural too.

We could also provide a way for a module to set its own weight in its *.info.yml, instead of having to call something in hook_install().

🇩🇪Germany donquixote

Let's see if project update bot will post anything here to tell us to close the issue..

🇩🇪Germany donquixote

I touched on something similar in a MR comment, but I agree with #31. As proposed, the `extensions` property is a flat array AFAICT, and it would not be possible to differentiate between themes and modules in the future without refactoring.

I imagine that even if we reuse this for theme hooks, we would have separate instances for modules and themes.
So we would never have a mix of modules and themes.
You would know from the context whether $list->extensions gives you only modules or only themes.

But, even if we want to reuse large parts of this ImplementationList class, we might still want to maintain two separate classes, even if they are similar, and even if the only difference is the name of these properties. DRY can be overrated.
I suspect there would be more differences, some methods might not be needed for themes.

🇩🇪Germany donquixote

Let's actually change the names from module / Modules to extensions so we can use this for themes too if we make them available.

I am not sure about this change.
The problems that we solve with this ImplementationList are (currently) very specific to module hooks.

For module hooks, every implementation needs a module associated with it, for these reasons:

  • In ModuleHandler->invokeAllWith(), the $module is passed to the callback.
  • In ModuleHandler->getCombinedListeners() (for ->alter()), the implementations are grouped by module before they are passed to hook_module_implements_alter().
  • In ModuleHandler->getCombinedListeners() (for ->alter()), when we apply the ordering rules, these also get access to the modules.

Also, until now, we don't have services registered for themes.

For theme hooks, we might be able work with a simpler ImplementationList which does not track the theme names.

On the other hand: If, in the future, we do align theme hooks more with module hooks, we may want to refactor parts of ModuleHandler and HookCollectorPass, moving shared functionality into a shared class.

Personally I would suggest to keep ImplementationList focused on modules for now, and also use the term "module" in variable names.
The class is already marked as internal, and the properties are protected, so it will be easy to change in the future.

In the future when we reuse this for themes we can either rename the properties, or we can create a new class, depending on the needs.

Until that happens, the code will be easier to understand if it says "$module" rather than "$extension".
We name things for what they are, not for what they could be.

🇩🇪Germany donquixote

donquixote changed the visibility of the branch 3540505-add-tests-for to hidden.

🇩🇪Germany donquixote

Related: 🐛 TypeError: htmlspecialchars(): Argument #1 ($string) must be of type string, array given in htmlspecialchars() (line 432 of core\lib\Drupal\Component\Utility\Html.php) Active

A question is whether we should already validate and filter placeholder arguments in FormattableMarkup::__construct(), or wait for __toString() to be called as it is now.

Doing it in __construct() will provide a more useful backtrace.
However, if there is any code that relies on non-Drupal placeholders returned from FormattableMarkup->getArguments(), even if they won't be inserted in the string, that code would no longer work as designed.

So maybe we can validate in __construct() to call trigger_error(), but filter in ::placeholderFormat() which is called in __toString(), where we then don't call trigger_error() again.

🇩🇪Germany donquixote

The main scenario is that FormattableMarkup::__construct() allows ['@placeholder' => NULL] values in $arguments, but FormattableMarkup::__toString() blows up if $this->arguments contains such values.

$v = t('Hello @null', ['@null' => NULL]);  // No error.
$v->__toString();  // Error.

This is a regression. because in older Drupal versions this silently worked (first I think with a notice, later with a deprecation warning).

The problem is that the error is separated from its cause, because the translatable string can travel far before it causes the error.

The cause can be an entity label being NULL. E.g. in MediaForm::form() we have this:

      $form['#title'] = $this->t('Edit %type_label @label', [
        '%type_label' => $media_type->label(),
        '@label' => $this->entity->label(),
      ]);

Where $entity->label() is technically allowed to return NULL, and it is possible to programmatically create a media entity without a title.

---------------

There are two directions where we can take this:

  1. Soften Html::escape() and/or FormattableMarkup::placeholderFormat() to handle NULL values.
    Both FormattableMarkup::placeholderFormat() and Html::escape() are static methods, so we cannot cleanly inject a logger.
    But we could just `@trigger_error(.., E_USER_NOTICE)`. Or we restore the deprecation.
  2. Harden FormattableMarkup::__construct(), to reject NULL values.

In the past we did deprecate NULL for FormattableMarkup::placeholderFormat() and Html::escape(), but we failed to deprecate it for FormattableMarkup::__construct(), so our old deprecation was worthless.

🇩🇪Germany donquixote

The page that you are testing, are any hooks called on a warm cache?
And is ModuleHandler instantiated at all in these requests?

If ::addListener() calls are what slows down these requests, maybe we could go even further and have a separate event dispatcher with only those events that are commonly invoked in requests with warm cache?

🇩🇪Germany donquixote

can we just have two event dispatchers?

We can.
But what does EventDispatcher give us for hooks? We don't really use it for anything other than collecting callbacks.
And then we cannot just use the callbacks, we have to add meta information about which module name to pass to a hook call, which file to include for a procedural implementation, and ordering rules to reorder "mixed" alter implementations.
I still think the change proposed here is an overall improvement.

I don't mind if you want to do this change as an incremental improvement.
It is certainly simpler in terms of how much code is changed.

🇩🇪Germany donquixote

The conceptual difference is that the class and method targeted by ReorderHook must match the hook assigned to that class and method as well. (in the test reorder is looking for test_b_subtype_alter and the hook for that class is actually test_a_supertype_alter)

In donquixote's approach this reorder still has an effect, I think that is also a bug.

I don't think it should be seen as a bug.
(the cross-hook ordering, not the TypeError)
The desired behavior in this scenario was not discussed in depth at the time we introduced these attributes.
The current behavior seems just as valid as the new behavior proposed by @nicxvan.
Changing it now does somewhat break existing behavior, even though it is a niche case.

This said, I could see one argument to see it as a bug: There is inconsistency with #[RemoveHook], which does not show this cross-hook behavior.

@donquixote mentioned in slack that he'd like to add a test for:

I added the test case for this in 3536470-ReOrderHook-on-non-implemented-hook-current-behavior and in 3536470-ReOrderHook-on-non-implemented-hook.

The goal was to provide a more relevant use case where reordering across hooks might actually be desirable.
In the test it is used to "reinforce" the ordering on a base alter hook implementation so it can overrule ordering for subtype alter implementations.

Overall I feel like our attributes are not well-suited for these competitive situations.
What if you really want to be "last, after X which also wants to be last, but before Y, which also wants to be last"?
And what if X and Y actually come from different hooks, which are all called togeher in ->alter([*, *, *], ..)?

The OrderAfter and OrderBefore were meant for this purpose, but I don't believe in them.
They are going to be very unreliable if the target implementation we order against is also reordered. Now the success of OrderAfter depends on _when_ that rule is applied.
If A.Last is applied first and then B.OrderAfter(A), we get ***, A, B.
If B.OrderAfter(A) is applied first and then A.Last, we get **, B, **, A.

The most reliable solution would be numeric weights/priorities.
We don't need to give weights to implementations, instead we can give weights to ordering rules.
(not in this issue, of course)

If we have a weights system, the use case mentioned before would mostly disappear.

Until then, maybe we can say that the relative order of implementations with competitive Order::First or Order::Last is not fully guaranteed.

🇩🇪Germany donquixote

donquixote changed the visibility of the branch 3536470-ReOrderHook-on-non-implemented-hook-current-behavior to active.

🇩🇪Germany donquixote

donquixote changed the visibility of the branch 3536470-ReOrderHook-on-non-implemented-hook-current-behavior to active.

🇩🇪Germany donquixote

The new version is a bit verbose but hopefully more clear.

The test-only branch illustrates the current behavior.

🇩🇪Germany donquixote

if you try to affect the order relative to a hook that doesn't exist, this still affects the order of other hooks that do exist.

Not really - unless I misunderstand.

When I see "order relative to", I think of #[ReorderHook(..., new OrderBefore(...))].
This is not what this is about.

The best example is in comment #17.

As you know, ->alter() can be called with more than one "alter type", which then maps to more than one alter hook.
E.g. ->alter(['form', 'form_node_form', 'form_node_article_form'], ..).
The hooks are 'form_alter', 'form_node_form_alter', 'form_node_article_form_alter'.

Each of these hooks can have order operations that were declared either with #[ReorderHook], or by setting an order in the #[Hook] attribute for a specific implementation.
All of these order operations will be applied when the implementations from these multiple hooks are "merged" at runtime.
When this happens, an order operation from e.g. 'form_node_form_alter' is allowed to target and reorder an implementation of 'form_alter'.

How?

Most of the #[ReorderHook] will target an implementation of the same hook that is passed in the first parameter of the attribute.
E.g.
- #[ReorderHook('form_alter', MyClass::class, 'formAlter', Order::First)], or
- #[ReorderHook('form_node_form_alter', MyClass::class, 'formNodeFormAlter', Order::Last)],
assuming that ::formAlter() implements hook_form_alter(), and ::formNodeFormAlter() implements hook_form_node_form_alter().

However, a #[ReorderHook] can target (and thus reorder) an implementation from a different hook.
This operation is only applied when both hooks are used together in an ->alter() call.

E.g. a completely silly example would be completely unrelated hooks #[ReorderHook('mail_alter', SomeClass::class, 'toolbar', Order::First)], assuming that SomeClass::toolbar() implements hook_toolbar().
These are never used together in ->alter(), and the second one ('toolbar') is not even an alter hook.
Therefore this #[ReorderHook] is completely pointless and has no effect.

However, we can #[ReorderHook('form_node_form_alter', MyClass::class, 'formAlter', Order::First)].
Here the target hook is 'form_node_form_alter', but the target method implements 'form_alter'.

What could be the intent behind this?

  • For any other form, when ->alter(['form', 'form_other_form'], ..) is called, MyClass::formAlter() should be in its regular order position, between other implementations.
  • For 'node_form', when ->alter(['form', 'form_node_form'], ..) is called, MyClass::formAlter() should be first, before other implementations.

Currently, this works as described _if_ there is at least one implementation of hook_form_node_form_alter().
This implementation could be from a different module that we don't care about.
If there is no such implementation, we get the TypeError as reported in this issue.

🇩🇪Germany donquixote

So the main difference is this:

  • Currently, a #[ReorderHook] targets an implementation (class + method) whenever it is invoked with the specified hook as part of the hooks/types passed to the ->invokeAll() or ->alter() call.
  • Instead, nicxvan suggests that a #[ReorderHook] should target an implementation (class + method) only if it is registered with the specified hook.
    This means the scenario in #17 would no longer be supported.

The distinction between "invoked with" and "registered with" is relevant only in multi-type alter calls like ->alter('form', 'form_myform').
An implementation is (typically) registered for only one hook, but the alter call can add more hooks into the mix.

The proposed change would a BC break.
The time to suggest this would have beeen pre 11.2.0 :)

What I would propose is to only fix the error case, and make it align with the non-error case.

🇩🇪Germany donquixote

Haven't fully read 17, but this only happens for alter calls, this does not occur as far as I tested for non alter hooks.

The "steps to reproduce" show how this applies to non-alter hooks.

Simply add this attribute (and nothing else) on a class in a Hook namespace:

use Drupal\Core\Hook\Attribute\ReorderHook;
use Drupal\Core\Hook\Order\Order;

#[ReorderHook('non_implemented_hook', 'NonExistingClass', 'nonExistingMethod', Order::Last)]

(don't forget the imports, or it won't work)

It also does not matter if the target class and method exist:
- If the hook has (other) implementations, but the target class and method do not exist, there is no error.
- If the hook has no implementations, but the target class and method do exist, just registered for a different hook (or no hook), we do get the TypeError during discovery.

🇩🇪Germany donquixote

This is really about targeting a hook that (currently) has no implementations.
If you target a hook that does have implementations, but the targeted implementation does not (currently) exist, there is no problem.

Also it is not "fatal" per se: A TypeError is catchable, it is up to Drupal whether to catch it.

I don't think that is quite right, ReorderHook can only target one implementation, if that implementation doesn't exist then it's rules should not apply at all.

Specifically:

so you target a form_alter impl in the scope of a form_form_id_alter

but if the form_form_id_alter doesn't exist then no ordering should happen, if the module want's to also order the form_alter they should target that separately.

The "if the form_form_id_alter doesn't exist" does not mean anything.
A hook may never be called or it may have no implementations.
There is no such a thing for a hook as to "not exist".

During discovery, we can only determine whether a hook has implementations or not.
But it may still be called.

#[ReorderHook('form_myform_alter', self::class, 'formAlter2', Order::First)]
class MyHooks {
  #[Hook('form_alter')]
  public function formAlter1(array &$form): void {
    $form['#alter_calls'][] = __METHOD__;
  }
  #[Hook('form_alter')]
  public function formAlter2(array &$form): void {
    $form['#alter_calls'][] = __METHOD__;
  }
}
class MyTest extends KernelTestBase {
  public function testFormAlterReorder(): void {
    $this->assertFormAlterCallOrder([
      MyHooks::class . '::formAlter1',
      MyHooks::class . '::formAlter2',
    ], 'form');
    $this->assertFormAlterCallOrder([
      MyHooks::class . '::formAlter2',
      // The #[ReOrderHook] rule is applied if 'form_myform' is within the alter types.
      MyHooks::class . '::formAlter1',
    ], ['form', 'form_myform']);
  }
}

Currently this will cause the TypeError for which this issue exists.
But we can easily craft a version of this scenario that does not crash: We simply add an implementation for hook_form_myform_alter().

#[ReorderHook('form_myform_alter', self::class, 'formAlter2', Order::First)]
class MyHooks {
  #[Hook('form_alter')]
  public function formAlter1(array &$form): void {
    $form['#alter_calls'][] = __METHOD__;
  }
  #[Hook('form_alter')]
  public function formAlter2(array &$form): void {
    $form['#alter_calls'][] = __METHOD__;
  }
  #[Hook('form_myform_alter')]
  public function formMyformAlter(array &$form): void {
    $form['#alter_calls'][] = __METHOD__;
  }
}
class MyTest extends KernelTestBase {
  public function testFormAlterReorder(): void {
    $this->assertFormAlterCallOrder([
      MyHooks::class . '::formAlter1',
      MyHooks::class . '::formAlter2',
    ], 'form');
    $this->assertFormAlterCallOrder([
      MyHooks::class . '::formAlter2',
      MyHooks::class . '::formMyformAlter',
      // The #[ReOrderHook] rule is applied if 'form_myform' is within the alter types.
      MyHooks::class . '::formAlter1',
    ], ['form', 'form_myform']);
  }
}
🇩🇪Germany donquixote

Imo we should merge the tests-only branch "3533944-alter-non-unique-current-behavior" before we do anything else.
https://git.drupalcode.org/project/drupal/-/merge_requests/12754/diffs
Maybe this needs a separate issue?

🇩🇪Germany donquixote

Alternatively if we change it back we fix the behavior of 11.2.

We could still do that as a follow-up, if we really want to.
I think for now the least disruptive solution is to preserve the 11.2.x behavior while removing the notice.

Btw currently we use FormattableMarkup to produce that notice, and as a result we get ugly escaped html in the placeholder variables.
Any other trigger_error() call in core uses raw string operations.

I would advocate for array_unique but that is forest impression without deep consideration.

Btw I notice that it matters where we call array_unique().
If we call array_unique() directly in ->alter(), it will change how the #[ReOrderHook] order operations are applied.
If instead we call array_unique() in ->getCombinedListeners() to get the listener lists, but we use the original (non-unique) hook names for order operations, we can keep the current behavior.

However this distinction is quite theoretical.
Most of the relevant cases are with form alter, and here the pattern is only ever ['form', 'form_x', 'form_x'], so ['a', 'b', 'b'].
To reproduce the order operation glitch, we need a pattern like ['a', 'b', 'a'].

🇩🇪Germany donquixote

In addition to array_unique() we could deprecate passing a non-unique list, so that it will _also_ be fixed in all calling code.

🇩🇪Germany donquixote

We'll definitely need sign off on collapsing the alters.

We should have asked for this before 11.2.0 release.
(And we should have added a test for this case)
Now if we change it back to have implementations repeated, we break with the behavior of 11.2.0.

🇩🇪Germany donquixote

The workaround exists since Drupal 9.0.* (or earlier).

Behavior in older Drupal versions

In Drupal prior to 11.2.x, when calling ->alter(['a', 'a'], ..) with the same alter type repeated, the implementations would simply be repeated.
The order was by module and then by alter type:

class HookAlterTest extends KernelTestBase {
  protected static $modules = [
    'procedural_hook_test',
    'system',
  ];
  public function testAlter(): void {
    $this->assertAlterCallOrder([
      'procedural_hook_test_test_main_alter',
      'procedural_hook_test_test_other_alter',
      'procedural_hook_test_test_main_alter',
      'procedural_hook_test_test_main_alter',
      'procedural_hook_test_test_other_alter',
      'system_test_main_alter',
      'system_test_main_alter',
      'system_test_main_alter',
    ], ['test_main', 'test_other', 'test_main', 'test_main', 'test_other', 'test_x']);
  }
  [...] (not sharing the full class here)

Behavior since 11.2.x

Since 11.2.x, every implementation is only executed once, and we see the notice as reported here.

  public function testAlter(): void {
    $this->assertAlterCallOrder([
      'procedural_hook_test_test_main_alter',
      'procedural_hook_test_test_other_alter',
      'system_test_main_alter',
    ], ['test_main', 'test_other', 'test_main', 'test_main', 'test_other', 'test_x']);
  }

However, both the deduplication and the notice were actually meant for a different case, where the same method is registered for two distinct alter hooks, which are then executed together with ->alter().
E.g. ->alter(['a', 'b'], ..) with #[Hook('a_alter')] and #[Hook('b_alter')] on the same method.
The case with ->alter(['a', 'a'], ..) was not really considered at the time. This is why the notice looks so strange.

Solution

Now we need to decide what should be the "correct" behavior.
I think it makes sense to execute every implementation only once.
But we don't want the same notice, if the duplication is caused by a duplicate alter type passed to ->alter().

The easiest might be to simply call array_unique() in ->alter().

class ModuleHandler ... {
  [..]
  public function alter($type, &$data, &$context1 = NULL, &$context2 = NULL) {
    [..]
    if (is_array($type)) {
      $type = array_unique($type);
      $cid = implode(',', $type);
    }
    else {
      $cid = $type;
    }
🇩🇪Germany donquixote

I think for now we want to make sure it works as advertised for for a limited set of supported scenarios.
Later we can incrementally add support for advanced autowire features.

🇩🇪Germany donquixote

From a functional perspective, it is fine to have field group as part of Drupal CMS.

From an architectural perspective, the structure of entity form and view display should support grouping out of the box, so that field_group does not need the acrobatics that it currently does.

We should either reopen this and move to core queue, or create a follow-up issue.

🇩🇪Germany donquixote

I just added something in "Proposed text".
But tbh, perhaps we can just drop this part instead, as it is already implied by other parts, as pointed out by #19.

I think I would rather disallow type-hinting with \stdClass at all to resolve this, then it not being allowed in the PhpDoc would not be a problem. Type-hinting on \stdClass seems like bad practice to me and it should be discouraged if it isn’t yet in our standards.

This proposal from #10 seems odd.
The doc type (and also the php type hint) should describe what is, not what we would wish to see instead.
At the time the code is refactored to allow different object types, the doc and the php type can change accordingly.

🇩🇪Germany donquixote

Actually on second thought this might need a CR since we're adding a new object and we're changing some methods.

Do we write change records for protected methods and protected properties?
And do we write change records for new classes marked as internal?

🇩🇪Germany donquixote

It was a helper for Drupal install and updating and wasn't even working there for ages.

It was working before the OOP hooks via EventDispatcher. So in 11.0.

Unless someone pops up though with a valid use case this is already deprecated. I'm not sure it's worth the effort to change.

Normally "deprecated" code still works as advertised or as expected, you should stop using it to prepare for the next major version.

🇩🇪Germany donquixote

My main concern here is whenever we refactor this, e.g. in 📌 [pp-1] Explore adding an ImplementationList object for hook implmentations Active , we are replicating severely broken functionality.

🇩🇪Germany donquixote

I think this is a dead end issue :)
There are always services in the call stack, and so far this seems to be ok.
The services from the old container create a new container, and that's how it works.

🇩🇪Germany donquixote

I think this type of config was never really meant to be handled in hook install but worked.

I don't think the creators of this hook had any preference whether or not people use it to alter configuration from config/optional from another module.

I'm wondering if it's worth considering a new hook: hook_post_install_optional

To me it seems like a clear regression.
We should fix core, rather than asking people to change their code.

🇩🇪Germany donquixote

Here is a list of commits for ModuleInstaller since 11.1.8.

* 4837c52b Issue #3497105 by quietone, borisson_: Fix LineLength for inline comments
* 101e0606 Issue #3496558 by alexpott, godotislate: Install entity definitions for all modules in batch install before simple config
* 6e7715a8 Issue #3492282 by nicxvan, longwave, catch: Refactor away ModuleInstaller::invokeAll()
* 59ce5ede Issue #3494212 by godotislate, alexpott, larowlan: Modules that provide fields for entities of other modules can't be installed
* 57583222 Issue #3487154 by quietone, daffie, smustgrave: Fix Drupal.Commenting.FunctionComment.MissingReturnComment in core/lib/Drupal/Core/A-E
* 58be5f12 Issue #3492235 by catch, longwave, mikelutz: Default container_rebuild_required to FALSE
* 16967de8 Issue #3492705 by catch, nicxvan, mikelutz, longwave: Install modules with container_rebuild_true set by themselves
* f8090e61 Issue #3416522 by catch, alexpott, longwave, phenaproxima, nicxvan, wim leers, smustgrave, larowlan, berdir, godotislate, dww: Add the ability to install multiple modules and only do a single container rebuild to ModuleInstaller
* eab46c49 Issue #3486995 by nicxvan: Clean up how ModuleInstaller invokes hooks around installing other modules
* 76323d65 Issue #3490296 by nicxvan: Mark hook_install_tasks and hook_install_tasks_alter as procedural only
* 2c9e1fa9 Issue #3486462 by nicxvan: Support #Hook for several hooks called by ModuleInstaller

🇩🇪Germany donquixote

The "fully-qualified type" has been kept because that was agreed to earlier in the issue and by a committer meeting.

My problem with this "fully-qualified" is that this can only apply to classes/interfaces.
E.g. if you have this:

/**
 * @param int $count
 * @param callable(array<string, \Drupal\node\NodeInterface>): void $callback
 * @param \Drupal\node\Whatever $whatever
 * @param \Closure $another_callback
 */

Here some parts of these @param docs have fully-qualified names, for other parts this term does not apply.

To recap from https://www.php.net/manual/en/language.namespaces.rules.php:

  • A "fully qualified name" can be a class, interface, trait, enum, namespace, function or constant (or other, perhaps?) name with namespace and leading namespace separator '\".
  • A "qualified name" can be a class, interface, trait, enum, function or constant name with namespace and without leading namespace separator '\".
  • (A "fully-qualified class name" (FQCN) is a class name (or interface, trait, enum) with namespace and leading "\".)
  • (A "qualified class name" is a class name with namespace but without the leading "\".)
  • A built-in type does not have a fully-qualified name.

As such, it does not really make sense to ask for the entire type to be "fully qualified", only specific namespace-aware symbols within the type can be fully-qualified.
And here I thought it is preferable to just define in one place that all namespace-aware symbols in any doc type must be fully-qualified, instead of repeating this everywhere.
(Of course we might at some point relax this requirement, but this is out of scope for this issue.)
(I think the main argument to use FQN is to not have imports only for doc types.)

🇩🇪Germany donquixote

We are still losing parts of the existing text, without an explicit intention to do so.
Especially the points about @return tag being required.

Let's try again.

  • Each parameter of a function must be documented with a @param tag (see exceptions below).
  • If a function has a return value, it must be documented with a @return tag (see exceptions below).
  • If there is no return value for a function, there must not be a @return tag.
  • For most functions (see exceptions below), summary lines must start with a third person singular present tense verb, and they must explain what the function does. Example: "Calculates the maximum weight for a list."
  • Functions that are easily described in one line may be documented by providing the function summary line only (omitting all parameters and return value).
  • Each parameter of a function must be documented with a @param tag (see exceptions below).
  • A @param tag must have a description, unless all of the following are true:
    1. The parameter is type-hinted.
    2. The parameter type is a single class, interface, or scalar and does not allow null.
      • For a class or interface, the parameter name is derived directly from the type name. For example, >code>QueueWorkerManagerInterface $queueWorkerManager.
      • For a scalar, the parameter name is self-documenting.
    3. A description would not add useful information.
    4. The type in the @param tag is identical to the parameter type in the PHP function signature.
  • If a function has a return value, it must be documented with a @return tag (see exceptions below).
  • If there is no return value for a function, there must not be a @return tag.
    1. The return is type-hinted.
    2. The return type is a single class or interface and is not nullable.
    3. A description would not add useful information.
  • For most functions (see exceptions below), summary lines must start with a third person singular present tense verb, and they must explain what the function does. Example: "Calculates the maximum weight for a list."
  • Functions that are easily described in one line may be documented by providing the function summary line only (omitting all parameters and return value).
🇩🇪Germany donquixote

Another problem with the current proposal is what is being replaced.

In fact, the "Current text" says nothing about parameter description or return description.

Further below we have separate sections about "@param" and "@return":

@param: Function parameters

The @param tag is used to document a parameter within a function docblock.

Syntax examples:
[...]

Syntax notes:
The @param tag is followed by an optional data type indicator, then the variable name of the parameter, and then a newline. The following paragraph is considered by the API module to be documentation. The API module formats the information on the same line as the @param using a strong HTML tag.

Drupal standards: The documentation is indented two spaces (see example). Data types are required to be included as of Drupal 8.x. Optional parameters are indicated by (optional); include information about the default value only if it is not obvious from the function signature.

Even here it is not fully clear to me that the param description is required.
It says "The following paragraph is considered by the API module to be documentation.", but does this mean it is required?

Perhaps some of the changes can move here. But I am not really sure.

🇩🇪Germany donquixote

Maybe 2 should be

the parameter is a single type

That covers is not null able, classes and escalates right?

It should, but I can see how some people might find it ambiguous when looking at `?MyClass` types.
I always prefer the explicit "don't make me think" style, even at the cost of having some perceived redundancy.
It is like in Monopoly go to jail card, which says "Do not pass go" and then "Do not collect $200".

I have updated the IS with the proposed text in #154 and the suggestion in #159.

The current version is contradictory.
We first talk about "class or interface", then later we talk about scalars.
Either we omit scalars from the proposal (which I prefer), or we have to include them in condition 2.

-----

Another problem with the current proposal is what is being replaced.
If we truly replace the entire section in "Current text", then we start with "An @param tag must have a description, unless ...", without having specified when a "@param" tag is needed.

The "Current text" says "Each parameter of a function must be documented with a @param tag (see exceptions below).". We seem to drop this completely.

Even if we do want to relax this requirement (which I would consider wrong for the scope of this issue), there should at least be something before we start talking about the description for a @param tag.

🇩🇪Germany donquixote

If would be nice to be able to omit a (IMO useless) return doc like this:

* @return string|null
  The title, or null if there's not one.

I don't think it is useless.
In many cases, the NULL case is actually more interesting.

E.g. "The entity identifier, or NULL if the object does not yet have an identifier.".
For entities, this generally means that the object is a "stub" and has not been saved in the database yet.

"The title, or null if there's not one."
This could mean the specific entity does not have a title, or that entities of this type never have a title.
Given that this description will be in an interface method, it is worthwhile to have a description.

I think it is best to move piecemeal on relaxing the doc requirements, and for each step we can look at specific examples.

🇩🇪Germany donquixote

(I was going to post the below yesterday but forgot to submit)

@donquixote, I really think saying "class or interface name" is a lot clearer than "type name", even with your example with ducks.

I assume it would apply to this part:
"The parameter name is derived directly from the type name." -> "The parameter name is derived directly from the class or interface name."
In that case, +1.

I think we do want to include scalars in the proposal; one of the biggest usecases for this is for constructor property promotion I think. Maybe we have a separate issue coding standards about that, but I thought this was part of a list of related policy changes for it.

When I search `@param int \$\w+` in Drupal core and in vendor/symfony, I see many places where I either want to keep the existing description or see it improved. And when just looking for `int \$\w` in vendor/symfony, to find undocumented parameters, in many cases I would like to see a description.

Of course there are cases when the parameter name is clear enough, or when a description would be redundant with the description on the method itself (e.g. for a setter method).

Having a general rule to always require a description is annoying, but if we leave it to the developer, it cannot be enforced by automatic tools. A rule at least forces the developer to think, and it gives the reviewer some leverage.

Also keep in mind that, with named argument calls, it is harder than before to rename an existing parameter that is considered part of a public API. As a consequence, we will have parameters names that are not ideal.

However, since it is easy for me to explain what the names should be for classes and interfaces, but more difficult to explain it for scalars, maybe we should split the policy proposal into two steps?

I think this would be a good idea, yes.
The rule proposed here when this issue was started to determine when to omit the description does not really work for scalar types.
The newly proposed rule "the parameter name is self documenting" would work but is not enforceable.

So I would prefer to handle this in a separate issue.

🇩🇪Germany donquixote

@donquixote, I don't understand. I didn't think we were talking about phpdoc; why did you think I did?

This was not about you, it was about others who may read these rules and think that "type-hinted" could refer to the param doc type.
But maybe saying "type-hinted" is clear enough.

🇩🇪Germany donquixote

I see the "scalar types" was added in #129, and I don't see why.
https://www.drupal.org/node/3376518/revisions/view/13999604/13999612

For my taste, we should simply drop that part:

A "@param" tag must have a description, unless all of the following are true:

  • The parameter is type-hinted.
  • The parameter type is a single class or interface (without allowing null).
  • The parameter name is derived directly from the type name.
    • E.g. "QueueWorkerManagerInterface $queueWorkerManager".
  • A description would not add new information.
  • The type in the @param tag exactly replicates the parameter type in the PHP function signature.

For the first condition, "The parameter is type-hinted.", how could we make it more clear that this is about the php signature and not the phpdoc type?

🇩🇪Germany donquixote

@xjm

Like so:

A parameter description may be omitted, if all of the following are true:

The parameter is type-hinted.
The parameter type is a single class, interface, or scalar type.
The parameter name is derived directly from the type name.
A description would not add new information.
The types in the @param tags exactly replicate the parameter types in the PHP function signature.
Otherwise, a parameter description must be included.

That removes heaps of negation, and reduces it from two sections to one. We could do likewise with the return section.

I like reducing the negation.
We could also drop the last line of "otherwise, ..." if we use "unless".

A "@param" tag must have a description, unless all of the following are true:

  • The parameter is type-hinted.
  • The parameter type is a single class, interface, or scalar type.
  • The parameter name is derived directly from the type name.
  • A description would not add new information.
  • The type in the @param tag exactly replicates the parameter type in the PHP function signature.
  • 🇩🇪Germany donquixote

    @xjm

    on arrays we have "The parameter type is not a single class, interface, or scalar type." which means all arrays would need @param docs still.

    Yes the way I understand it, this would only require description on arrays.

    I would disagree. I would argue that on most scalar types there is something to document:
    - For arrays we want to know the structure, and what the keys and values represent.
    - For strings, we may want to know if it is a machine name or a human-readable label, whether it is raw input or has been sanitized already, etc.
    - For boolean, we want to know what is the meaning of true or false.
    - For integer, we want to know whether it is seconds or milliseconds.

    See what I said in #16.

    The parameter name is derived directly from the type name.

    The idea here was this, if that is still valid:
    "DuckInterface $duck" -> does not need a description.
    "DuckInterface $animal" -> needs a description.
    "QueueWorkerManagerInterface $queueManager" -> needs a description
    "QueueWorkerManagerInterface $queueWorkerManager" -> does not need a description
    "UnroutedUrlAssemblerInterface $url_assembler" -> needs a description
    "UnroutedUrlAssemblerInterface $unrouted_url_assembler" -> does not need a description

    Now up to you what you think about it :)

    🇩🇪Germany donquixote

    @krystalcode (#106)

    The way Git merges changes does not have much to do - if anything - with the sorting of the contents of the code. Conflicts most commonly happen when you edit the same lines.

    The problem with git vs imports is not just about actual conflicts.
    If there is no strict order, there can also be duplication:

    commit 1:

    use N\A
    + use N\NewClass;
    use N\B
    use N\C
    

    commit 2:

    use N\A
    use N\B
    use N\C
    + use N\NewClass;
    

    merge 1 + 2:

    use N\A
    use N\NewClass;
    use N\B
    use N\C
    use N\NewClass;
    

    Let's say we merge this, we notice the problem, and we want to clean it up.
    But we actually have two branches that do this clean-up:

    cleanup branch 1:

    use N\A
    - use N\NewClass;
    use N\B
    use N\C
    use N\NewClass;
    

    cleanup branch 2:

    use N\A
    use N\NewClass;
    use N\B
    use N\C
    - use N\NewClass;
    

    on merge, both of them are gone, and the import will be missing:

    use N\A
    use N\B
    use N\C
    

    Often enough, this does not happen in actual merges, but in a rebase or cherry-pick.
    E.g. commit 1 above might be from a feature branch, then commit 2 is from a different feature that is merged upstream. Then branch 1 gets rebased on upstream, and what appears as "merge 1 + 2" above instead occurs as a commit in the rebase.

    Having a strict order avoids these problems.
    When there is ambiguity, the merge or rebase will see that, and the developer will be forced to sort it out.

    🇩🇪Germany donquixote

    And what would cause Core to not come first? Every contrib module import is already after Core, as its namespaces are lowercase.

    https://git.drupalcode.org/project/aws/-/blob/2.0.x/src/Entity/Profile.p...

    In other words, code outside the Drupal namespace, mostly packages that live in /vendor/.
    In fact, some 3rd party package namespaces will come before "Drupal\...", others will come after.
    All of this is totally ok, and is not any different from e.g. symfony packages, where imports are sorted like "use Composer\...", "use Psr\...", "use Symfony\...", "use Twig\...".

    These imports are going to be mostly managed and read by tools like IDEs or phpcbf.
    As a developer, I don't want to scroll up and look at the imports.
    When I write "new MyClass", the autocomplete will propose a matching qualified class name (class with namespace), and insert the import at the top.

    Any convention we introduce here has to be one that is supported by commonly used tools, and does not require new custom sorting specific to Drupal.

    🇩🇪Germany donquixote

    The ModuleHandler::add() is more broken than we thought.

    It picks up procedural implementations only.

    This is the least part of the problem.
    Much worse is that it removes existing implementations, and prevents them from being added later.
    See 🐛 ModuleHandler::add() can prevent existing implementations from being added Active .

    🇩🇪Germany donquixote

    donquixote changed the visibility of the branch 3519561-DeprecationCanaryTest to hidden.

    🇩🇪Germany donquixote

    donquixote changed the visibility of the branch 3519561-hook-ImplementationList-remove-getHookListeners to hidden.

    🇩🇪Germany donquixote

    donquixote changed the visibility of the branch 3519561-DeprecationCanaryTest to hidden.

    🇩🇪Germany donquixote

    I created an alternative MR where ProceduralCall is still used in hook implementation identifiers and in the parameters passed to ImplementationList::load().
    Tbh, this feels out of place, I don't like it.

    🇩🇪Germany donquixote

    Actually I like keeping them as "obsolete" following the motivation from @elc in #93. I find that very convincing.

    Production build 0.71.5 2024