I am proposing this change to avoid relying on two synced arrays in ModuleHandler:
https://git.drupalcode.org/project/drupal/-/merge_requests/11676/diffs?c...
donquixote → changed the visibility of the branch 3485896-ImplementationList-object to hidden.
donquixote → changed the visibility of the branch 3485896-order-operations-by-specifity to hidden.
I updated the comment, remove the todo.
But to recap my point from our conversation:
We should add cache metadata for dependencies when we actually use them.
In some cases this means adding the same dependency twice in two different methods, so that if we remove one of them, the other is still there.
On the other hand, in general we should not add cache metadata for something that we think will be used elsewhere.
This does not cause much harm for a config object, but it breaks the pattern and can lead people astray, which now rely on that cache metadata being added elsewhere.
Here we are already in that scenario, where we now have to support those hypothetical people.
For this reason it seems wrong to add config cache metadata to CasRedirectData, if nothing in that object really depends on that config.
E.g. there is nothing you could update in the config that would lead to different outward behavior of CasRedirectData.
Of course this can change if an event subscriber adds some data to CasRedirectData that is dependent on that config. Also because the config is made available in the event object. Normally that event subscriber would then have to add the config cache metadata to the CasRedirectData.
CasRedirectData to implement RefinableCacheableDependencyInterface and use RefinableCacheableDependencyTrait
This actually seems like a good idea.
However, perhaps it is better to rethink the architecture for cas 4.x, maybe a lot of it becomes obsolete.
But if this can be done BC-friendly for 3.x, sure.
The change proposed here allows snake_case to coexist in a file with promoted properties, which is good!
So I support this change, in general.
What I still don't like is that right now we don't have a clear preference in core whether to use snake_case or lowerCamel, and that decision is left to MR authors. I have seen MRs that keep the old snake_case but also some that move to lowerCamel on a by-file basis.
I think it is ok to be more flexible for contrib and custom code that wants to use Drupal conventions, but I think for core we should have some kind of guideline.
My personal preference is currently snake_case, partly because of arguments of readability made in the past, which at the time I was skeptical about but which I think do have merit. But I think this should be secondary here, my main point is I don't like it to be arbitrary. The point of conventions is to eliminate meaningless choices.
The conflict with promoted constructor parameters can be resolved in different ways:
- For promoted properties, the respective parameters shall also be lowerCamel (which they have to, naturally).
- For a constructor with at least one promoted property, all parameters shall be lowerCamel.
This will allow to convert other parameters to promoted properties in the future, without breaking named argument calls. - For any constructor, the parameters shall be lowerCamel.
This will allow to convert that constructor to use property promotion in the future, without breaking named argument calls. Everything else remains snake_case. - For any constructor and any static factory, the parameters shall be lowerCamel.
This allows to copy parameter lists more easily between constructor and factory. - Every parameter of any method in a class (that has a constructor or may get one in the future, which means any class) shall be lowerCamel.
(I don't really like this one, it is just to extend consistency half way.) - Every parameter and every local variable in a class file shall be lowerCamel.
(This is what I see in some MRs)
Outside of property promotion, we could also say things like:
- Files copied from elsewhere, or files already existing in Drupal core, can preserve the naming style for parameters and local variables, as long as it is one of snake_case and lowerCamel.
- For newly added class files, the preference as expressed above applies.
All of this can be a follow-up.
I would advocate for closing this, 1-4 micro seconds isn't worth chasing. There are far larger bottlenecks.
The measurements so far don't look promising, but I would not say they are conclusive (yet).
The 1-4 ms are some kind of artificial scenario.
Also, some of this might change after the changes from
📌
Hook ordering across OOP, procedural and with extra types
Active
, when omitting the $modules could provide other benefits.
donquixote → changed the visibility of the branch 3485896-test-file-scan-order to hidden.
donquixote → changed the visibility of the branch 3485896-test-file-scan-order to hidden.
donquixote → changed the visibility of the branch 3485896-donquixote-order-at-call-time-nicxvan-changes-TBD to hidden.
donquixote → changed the visibility of the branch test-file-scan-order to hidden.
Some equation juggling which may or may not be helpful.
Note that `(int)` behaves like floor() for positive numbers and like ceil() for negative numbers, which makes it harder to reason about.
But the bug still happens if we replace (int) with floor().
getRequestTime() = getCurrentTime() - (REAL.CURRENT_TIME - REAL.REQUEST_TIME)
getCurrentTime() = floor(getCurrentMicroTime())
getCurrentMicroTime() = SPECIFIED_TIME + getMicroTimePassed()
getMicroTimePassed() = REAL.CURRENT_MICRO_TIME - TIME_STARTEDgetRequestTime() = floor(REAL.CURRENT_MICRO_TIME - TIME_STARTED)
- REAL.CURRENT_TIME
+ REAL.REQUEST_TIME
+ SPECIFIED_TIMEgetRequestTime() = floor(REAL.CURRENT_TIME + REAL.CURRENT_TIME_FRACTION - TIME_STARTED)
- REAL.CURRENT_TIME
+ REAL.REQUEST_TIME
+ SPECIFIED_TIMEgetRequestTime() = floor(REAL.CURRENT_TIME_FRACTION - TIME_STARTED)
+ REAL.REQUEST_TIME
+ SPECIFIED_TIMEgetRequestTime() = floor(REAL.CURRENT_TIME_FRACTION - TIME_STARTED_FRACTION)
- TIME_STARTED_SECONDS
+ REAL.REQUEST_TIME
+ SPECIFIED_TIME
Here we split float variables into varname_int + varname_fraction, so that we can move the integer part out of the floor() call.
This would not be possible with just (int) due to the asymmetry mentioned above.
The problem would go away if TIME_STARTED_FRACTION is zero.
donquixote → created an issue.
Some people in chat were concerned about serialization of order operations, I think mostly for security.
I am not strongly attached to serialize/unserialize, but I think the security should be ok if we use a hard-coded list of very simple classes for 'allowed_classes'.
From https://www.php.net/manual/en/function.unserialize.php:
Warning
Do not pass untrusted user input to unserialize() regardless of the options value of allowed_classes. Unserialization can result in code being loaded and executed due to object instantiation and autoloading, and a malicious user may be able to exploit this. Use a safe, standard data interchange format such as JSON (via json_decode() and json_encode()) if you need to pass serialized data to the user.If you need to unserialize externally-stored serialized data, consider using hash_hmac() for data validation. Make sure data is not modified by anyone but you.
In this case the data is not "untrusted".
But, even if it was, what exactly can go wrong if the classes we allow are as simple as here?
I would like to know.
(and don't tell me "if the class has a wakeup method...". the specific classes we are looking at don't have it.)
We could actually create a separate issue where we add a part of HookOrderTest for HMIA as baseline, to be merged _before_ this one.
This will make it more visible if or how the ordering changes.
There is no issue with:
#[LegacyModuleImplementsAlter]
#[RemoveHook]
#[ReOrderHook]
Correct, technically these work as designed.
The problem is not with the #[LegacyModuleImplementsAlter] attribute but the greater purpose behind it.
Of course technically the problem is with the new order property in #[Hook].
We want modules to be able to use the new order property of the #[Hook] attribute, while at the same time keeping HMIA around to support older Drupal versions.
This will cause problems as described above in 11.1.x.
What these modules could do is declare compatibility with 10.x and 11.2.x but not with 11.1.x.
We found one problem with #[LegacyModuleImplementsAlter], or the idea behind it.
The solution could be a polyfill in a separate package outside of Drupal core.
Collection order should not have an effect.
It matters because if a module has multiple implementations, their order can be different on different system.
We key by class and method, but we don't sort.
Also, if we would sort by class, we would get a weird position for procedural based on the alphabetic order of ProceduralCall class.
Instead, we should decide whether procedural implementations should be always before or always after oop implementations from the same module (if no other order specified).
#[LegacyModuleImplementsAlter] problem
We found one problem with #[LegacyModuleImplementsAlter], or the idea behind it.
The idea was that a contrib module can support Drupal 11.1.x and Drupal 11.2.x+ by:
- Creating a MYMODULE_module_implements_alter() with #[LegacyModuleImplementsAlter] so that it is only called in Drupal < 11.2.0.
- Adding OOP hook implementations with e.g. #[Hook('my_hook', order: Order::First)], where the "order" part would only be picked up in Drupal 11.2.0 and up.
Problem:
In 11.1.x, the hook attributes will be discovered, but then the `order: Order::First` part will blow up because:
- The enum for Order::First does not exist.
- The named parameter `order: ` does not exist.
I found that objects in a container service parameter definition are no good, it prevents dumping of the container. So I changed it to serialize them explicitly.
Now we have HookOrderTest failing in the pipeline but not in my local.
The reason seems to be that RecursiveDirectoryIterator / DirectoryIterator does not produce a predictable order.
https://stackoverflow.com/questions/29102983/order-in-filesystemiterator
And the way we look for hooks, we mix procedural and oop in the same scan operation.
This means it is not clear whether OOP hooks are first or procedural hooks are first, if from the same module.
I created a separate MR with call time ordering.
Still a bit rough, but this is what I have for now.
To explain it differently:
The sorting done in one specific group of hooks can only be used when ->alter() is called with those exact hook names.
But this is not what the MR is doing.
Instead, when you call ->alter() with hooks/types ['a', 'b', 'c'], you might get the pre-ordered list from ['b', 'c', 'd'] or from ['b', 'x'], which gives you not only the wrong order, but can also give you the wrong implementations. Even if we correct the list of implementations at call time, by adding and removing implementations, the order would still be wrong.
There is no way to fix this within the current proposal, because naturally there won't always be a pre-ordered list for the current combination.
We can check that the implementation exists before adding to the order group.
This is missing the point.
I just pushed another version of the side effect test, which is using existing implementations.
What mechanism are you proposing that does that?
The "extra types" mechanism is just fundamentally flawed.
For an implementation to ask "I want to be ordered together with X" inevitably has an effect on other implementations and other groupings that can exist at call time. This can be mitigated to some extent, but the basic flaw is always present.
The correct thing is to order at call time.
Otherwise we would have to predict every possible combination of hooks that can be passed to alter().
New branch to prove the side effect:
https://git.drupalcode.org/issue/drupal-3485896/-/compare/11.x...3485896...
This commit:
https://git.drupalcode.org/issue/drupal-3485896/-/commit/404bc461f0d3fa5...
So we add a #[ReOrderHook] that targets a non-existing implementation in a non-existing module.
This should have no effect on the order of the other implementations.
But the extraTypes messes things up.
We see the first fail, but I suspect there is more, including possibly implementations disappearing.
Should we take this out of needs review for a bit?
Done.
I was on the fence about splitting hook collector out, but honestly pass is getting huge. I wonder if we have more files, one that does collection, one with parsing and ordering and we leave the registration in hcp?
I am undecided about how exactly to split the classes.
For now I just wanted to see that we _can_ split this up, so that ModuleHandler no longer has to deal with a compiler pass directly.
A separate HookCollector feels like a more generic class to use in different places.
Currently we still have lots of stuff in that class. So indeed we might want to move some of it back into HookCollectorPass, anything that deals with the container.
But, any decision we take now about the final shape of HookCollector might become obsolete with further changes.
I created new branches with tests.
For now I created an MR only for one of them, for tests that include the 3485896-attempt-simplification and do not include my own changes.
In general I think this is direction that these tests should take:
We call a hook or an alter hook, we get a list of called implementations as from __FUNCTION__ or __METHOD__, and we compare that to the expected order.
The test has some additional magic where we want to check that `->alter(['a', 'b', 'unknown'], ...)` produces the same call list as `->alter(['a', 'b'])`.
Currently this is more like a proof of concept, we definitely don't want to merge as-is.
The tests confirm my suspicion:
An order operation with "extra" types has side effects beyond that one implementation being reordered.
We could probably craft something even more clear to confirm this.
The tests would become a lot more interesting if we fix the grouping by module problem in ModuleHandler.
I know this is currently planned as follow-up, but we might actually want to do this first.
Some thoughts about relative ordering
In hook_module_implements_alter(), relative ordering always means to order one module relative to another module.
With the new code, we can order specific implementations relative to other specific implementations, OR to all implementations of a module.
As mentioned earlier, ordering relative to a specific implementation easily breaks when that implementation gets renamed.
Ordering relative to all implementations of a module might have unintended effects if one of these as a "first" or "last" order applied.
In my older proposal, there was the possibility to order relative to a module's default odering position.
So, even if the implementations of that module get moved around with first, last, before or after ordering, we still only order relative to the original position.
So, let's say:
- We have modules A, B, C, D, in that order.
- We have hook implementations A1, B1, B2, B3, C1, D1.
- We have ordering info "B1 first" and "D1 before B*".
With the current proposal we get D1, B1, A1, B2, B3, C1.
If "before B*" is changed to "before the default position of B", we get B1, A1, D1, B2, B3, C1.
We can even make this work if none of the B* implementations remain in the default position.
Another way to approach this is to define an order of ordering operations.
E.g. we could apply all relative ordering (before/after) before all absolute ordering (first/last). But this only gets us so far.
The third MR extracts a new class HookCollector from HookCollectorPass which now does most of the logic and is mostly independent from the container.
https://git.drupalcode.org/project/drupal/-/merge_requests/11415/diffs
If you want a cleaner diff, look at v2 instead:
https://git.drupalcode.org/project/drupal/-/merge_requests/11344/diffs
I would still like to do more changes:
- I think we should use $snake_case for all new local variables and non-promoted parameters. But for now I want to reduce disruption. We can do this once we agree.
- I would like to rename more variables, but again want to reduce disruption.
- More tests.
- Order and alter hook implementations at call time, especially for multi hook alter calls.
Let's keep this as follow-up, with actual measurements.
As said, I suspect the main cost is loading the file. After that, I suspect we can call "new ReflectionClass" and "$reflector->getAttributes()" repeatedly without much extra cost.
Atm I think we should drop everything that smells like additional complexity.
In this case, dropping that symfony reflection cache allows to do most of the calculations without a container.
I suspect that the main cost of reflection is loading and parsing the class file. And of course scanning the filesystem.
After that, I suspect we can call "new ReflectionClass" and "$reflector->getAttributes()" repeatedly without much extra cost.
So we should be really careful about the cost/benefit.
I tried to see what kind of caching symfony does.
I see a runtime cache (in a variable), I don't see anything persistent.
Besides, that symfony code looks quite scary to me, like it does a lot of stuff we don't need.
Maybe the benefit depends on how many times we want attributes from the same class for different purposes.
Regarding a parser instead of php:
In the past I created https://github.com/donquixote/quick-attributes-parser/-
It is definitely faster than nikic php-parser, at least it was at the time.
But I still would not recommend this for performance gains. I only did this because I wanted to support attributes in older php versions.
I refactored this based on a discussion with @GhostOfDrupalPast to use the container reflections that Symfony caches, this should be much faster.
What do we actually gain from this?
Worst case is that a ReflectionClass is created again, after it was already created in symfony.
I suspect the main cost of "new ReflectionClass" is just to load and parse the class file. Once the class is known to php, it will be very fast.
See also https://gist.github.com/mindplay-dk/3359812
symfony ContainerBuilder::getReflectionClass() actually does some crazy magic that we don't need, so it might actually be slower than just creating the reflection class again.
It definitely increases complexity and mysteriousness.
I would like to remove it to reduce the places that get passed a container.
Hi
Just a quick clarification.
If I read it right you can remove Hook priority entirely right? I'd have to take some more time to look at it, but it looks interesting.
Yes, we can remove HookPriority class, I just did not get to it yet.
I don't think you made any substantial changes beyond what you changed in HCP
The main change here was to change the algorithm within HookCollectorPass.
I wanted less interaction with the container, and only deal with priorities once everything is finished.
Once this is complete, we can run the main part of the sorting logic in a clean unit test without too much mocking.
I did try to keep the behavior of the existing MR.
This was successful as far as existing tests, _but_ as mentioned, it could be that additional tests will reveal functional changes.
In this version, we cannot remove the $order->extraHooks / ->extraTypes.
In a future version (which I think we can still do in this issue), we can try sorting at call time, which would allow us to get rid of $order->extraTypes.
(I say in this issue, because I much prefer not add something than to add and then remove.)
Once HMIA is gone this edge case is gone.
We can cross that bridge when we come to it, if we ever do.
Realistically we won't need to. This only takes affect if they are form alters.
To me these are not good answers that justify to "resolve" a review comment.
We are designing a new system, and we want to support BC.
All of this has to work reliably and predictably, otherwise it should not be added.
Form alters are not an edge case. HMIA is not an edge case, it has to be 100% supported until we remove it in 12.x.
Also, the time spent on an issue does not determine the mergeability.
The next steps I see for me:
- Add tests to cover more combinations.
- Further cleanup and refactoring of the code in this issue.
https://git.drupalcode.org/project/drupal/-/merge_requests/11344/diffs
The current version of this (commit af51ab5bb1ef81e35658ffa2dd1514d7757669c2) shows how we can calculate everything first and _then_ write to the container, in HookCollectorPass.
(It also contains some unrelated changes.)
It passes existing tests.
However, it is very possible that we just don't have enough coverage.
In that case, we should improve the tests.
To improve test coverage for different combinations, we can create a hook with a bunch of implementations with different ordering info, where each does "return __METHOD__;", or sets `$arg[] = __METHOD__` for an alter hook.
Then, instead of asserting something in $GLOBALS, we can assert the return value which gives us a list of all the methods that were called, and their order.
There are still more possible simplifications we can do.
donquixote → changed the visibility of the branch 3485896-donquixote-suggested-changes-part2 to active.
$snake_case vs $camelCase
The 11.x version of HookCollectorPass and other classes has a mix of snake and camel case in local variables and parameters, with snake case being the majority.
The MR introduces new variables that are mostly lower camel case, although in HookPriority I see new snake case variables.
I have a personal preference which has shifted over time, currently leaning towards snake case.
But this should not be relevant here.
The coding standards regarding this have been relaxed. I even saw discussions to soften the rule of not mixing within a file. But this does not mean it should be arbitrary in Drupal core.
But I also agree with @nicxvan to avoid renaming until there is broader consensus. Don't rename things just for me.
I can also all but guarantee that anybody trying to use this will probably not need it and it will be wrong.
We can't build this feature for what might be or how people might use it, it's far, far too much of an edge case. If someone opens a bug against it I will happily chase it down and try to find a solution, but we can't make guesses in this area, if it were not for HMIA supporting something like it, we would not have built it in the first place. As it stands we do need it unfortunately.
The bar for this really should be does it work in core, can we address contrib if they need it? The truth is if there is a contrib module doing some ordering on extra types this doesn't work for they can keep their procedural implementation and we can work on solving their use case before Drupal 12.
We are designing a public API here, and anything we create will be very hard to remove later.
We can fix things before a release, but in general we should only commit what we are ready to release.
From what I remember, ckeditor5 would work just fine with "Order::Last" instead of "Order::After".
The main reason why we would try to support this case in core instead of replacing it with "Order::After" is that there are likely more such cases in contrib, and we keep this one as a canary.
(Or maybe we are concerned about BC impact of moving this hook priority? I don't think so.)
What we don't want is add a bunch of complex code and complex extension points, and then if we are worried it might not work we say nobody will use it anyway. If we only care about one specific case in core, we better to some kind of one-off hack.
This makes me wonder: Order::First and Order::Last do not allow to specify additional hook names.
How will first and last work if alter is called with multiple hooks?
To be fair my core contribution experience is more limited than yours
To be clear I don't have any authority and have been a bit out of the loop in the last years.
I mostly express what makes sense to me.
I welcome any opinions from others.
Another higher-level recap.
Basic concepts
(these are just observations to verify my understanding.)
The new attributes and attribute properties allow to specify ordering information for hook implementations, or to entirely remove specific implementations.
An implementation can be first, last, or before/after specific other implementations.
For relative ordering and for removal, other implementations are addressed by hook name, class name and method name.
(Consequence: When the class or method for a hook implementation get renamed, the ordering attributes that target the old name no longer work.)
Alternatively, a relative ordering can address module names, which will target all implementations by that module.
Ordering rules from attributes are applied _after_ the implementations have been reordered with hook_module_implements_alter().
Ordering rules from attributes are applied one by one, in the order in which they were discovered, but with the order on regular Hook attributes being applied first, then the order from ReOrderHook attributes.
An ordering rule applied later will overwrite / win over a conflicting order by a previous rule.
A relative ordering rule can have "extra hooks" specified, which means that this ordering rule should be applied for ->alter() calls with multiple hooks. (more on that below)
Conflict resolution
The resolution of conflicts is quite simple:
Any "later" ordering rule will overwrite any "earlier" ordering rule.
If we want to control this further in the future, we should provide ways to prioritize the ordering rules.
(This could bring us to a similar situation as explicit priorities, but we can discuss that in the future)
Overall this basic concept is fine!
It is different from what I proposed in my older MR, but it is a very valid way to order hook implementations.
Extra hooks logic
Tbh I find it hard to follow the "extra hooks" logic.
I mean I think I understand the intention, but then I have trouble to follow what exactly is happening in the code.
I also suspect that developers who want to apply order attributes may not fully understand the consequences of adding or not adding hooks in the order extra types.
At the same time I had the suspicion that it might not behave as we expect in many scenarios.
But it is hard to point out possible flaws when I might just not correctly understand it.
This is why my earlier comments were primarily focused on docs, naming and possible refactoring.
Naming of methods and variables
(these are generic comments, for specific comments see the review)
My comments about naming may appear nitpicky and pedantic.
In general there is no "perfect" naming, but there are some mistakes we can avoid.
Overall, I like it if the same thing has the same name throughout a subsystem, and more so within the same method, and different things to have different names.
E.g. when I see something like `foreach ($priorities as $weight)` or `foreach ($trees as $forest)`, I get confused.
Of course it is much easier to point out flaws than to propose better names.
Sometimes, the difficulty to find a good name can point to a design issue, where the variable or method should not really exist in that way.
Test coverage
Actually I have not looked at all into the tests.
But I suspect we need more, given that some possible problems I found in the code did not cause test failures.
If we could isolate the calculation some more, it might become possible to cover a lot of stuff with unit tests, which are much faster than
Follow-up issues?
To me, follow-up issues should be for additional functionality, or to fix things that we left unchanged.
I don't like to plan follow-ups to change things from a current MR.
We still have some time until the next release, so we _could_ change or rewrite parts of this ordering in follow-ups, but I would prefer if we can avoid it.
Alternative: Order at call time (cached)
I can imagine one alternative that (I think) would allow to drop the "extra hooks" or "extra types" in the order object added to an attribute.
We could simply apply the ordering rules from attributes at runtime:
At container build time:
- Calculate the order for regular single hook.
- Store ordering rules per hook.
At runtime for a multi-hook alter call:
- Load and merge the implementations for each hook.
- Group by module, apply module order, call HMIA with the primary hook (as was done in older versions of Drupal).
- Undo the grouping by module, so we have a list of individual implementations, by identifier.
- Apply the ordering rules for each of the hooks from the ->alter() call.
- Cache the result.
A tricky edge case here would be if the same class + method implements multiple hooks that are invoked together.
So, if something calls `->alter(['A', 'B'], ..)`, and we have `#[Hook('A_alter'), Hook('B_alter')]` on one method, possibly with different ordering information, then the targeting with class + method would become ambiguous.
To avoid this ambiguity there could be an additional parameter in the Order object, to determine whether we order relative to A_alter or B_alter.
But, I think it would be much better to just disallow that, or to order relative to both.
This alternative will lead to a situation where we order "combined" hooks at call time, but regular hooks at container build time.
We _could_ instead just do all the ordering at call time, which would probably reduce code duplication.
Note that in the current MR we already do ordering at call time for a multi-hook, if we do not find a pre-ordered combined hook.
donquixote → changed the visibility of the branch 3485896-donquixote-suggested-changes-part2 to hidden.
donquixote → changed the visibility of the branch 3485896-donquixote-suggested-changes-part1 to hidden.
donquixote → changed the visibility of the branch 3485896-donquixote-suggested-changes to hidden.
I encountered this myself while working on another core issue.
To me, clearing out all empty arrays after their children have been cleaned up would be the preferable behavior.
Unlike the current behavior, this would be idempotent, which is generally nice to have.
However, this would be a change of existing behavior that we could consider a BC break.
An alternative would be to introduce a separate method which has the desired behavior.
Like NestedArray::filterRecursive() or NestedArray::filterCompletely().
The solution proposed in the issue summary is correct either way: We filter the parent array _after_ filtering the children.
Thank you so much for your review and this comment, you've obviously put a lot of thought into this topic over the years and this issue specifically.
I regret the inconsistency in my level of participation and commitment.
At a high level, I don't see anything here that should block this issue, once this is in there is some opportunity for clean up / expansion, but this is already far broader than a typical issue due to the nature of HMIA. I truly think for this issue the thought should be
In the MR for this issue we are introducing the different Order classes, including ComplexOrder with the ->extraTypes, which would be obsolete if we use a different model / architecture for the ordering.
(I need to do a more detailed analysis of these)
We should avoid adding features and extension points that we later remove.
We don't need to do everything in one issue, but we should know in which direction we are moving.
The initial OOP hook issue did use explicit priorities and that blew up which is why it was pulled out, this issue was specifically borne out of a BC way to allow ordering relatively.
Can you remind me what is the "initial OOP hook issue" and how it failed? Maybe I missed it.
(I assume we are not talking about my older issue here, which mostly suffered from being overly ambitious. I don't recall that anything "blew up".)
This said, I can imagine how this could have failed, depending how it is done.
Yes, hooks are different so the divergence is expected
My point is the more we diverge and the less we reuse from symfony events, the more we have to ask what we gain from using the symfony system, or if it would have been better to do something custom.
Maybe it is too late now.
multiple implementations are new, going to be rare, and ordering them as a separate entity will be even rarer.
The MR provides a way to order specific implementations relative to each other.
If we assume all implementations of one module to have the same ordering position, then all we would need is to order relative to a module, not relative to an implementation.
To me this MR is incomplete if we don't properly support multiple implementations per module. It could even reveal shortcomings in the approach.
Some higher level thoughts.
Divergence from symfony event system
We are using the symfony event system, but then we are reinventing the way that these events are declared, how we specify ordering, and how we invoke them.
Explicit vs calculated numeric priorities
Atm, if a module uses explicit event subscriber priorities, I suspect it does not really work so well in combination with the dynamic/calculated priorities produced by the hook ordering system proposed here.
If we want to stick closer to the event system, then the default way to specify ordering would just be to specify explicit priorities. There are reasons why we might want to avoid this, because modules will come up with random numbers like -999, -5 or -1000000. But this also applies to events that we already have in Drupal, and where we are ok with just priorities.
We could also consider to order event subscribers without setting a calculated priority.
If we have 10 subscribers with prio = 0, we can just order that array and set it, while the priority remains 0 for each.
We could either do this later in the process, when the tagged services have already been evaluated and applied as a service parameter, or we rely on the order of tagged service definitions in the container. Not sure if this works.
Relative ordering
One thing the symfony event system does not cover is relative ordering.
I remember older discussions where it was said the relative ordering is not really needed.
We only had one such case in core (ckeditor5) and here it was not even really required. But perhaps contrib has such cases.
In the past, with no more than one implementation per module (for regular hooks), relative ordering just meant ordering of module names relative to each other.
Even with advanced multi-hook alter hooks, where you could already have multiple implementations per module, we would group them by module for ordering, to have something in the format we can pass to hook_module_implements_alter().
Now we want to order implementations relative to each other, which is more than what we would need if we only want to preserve the old/existing ordering functionality.
In my older MR, we would order implementations relative to module names, which completely eliminates the possibility of circular conflict.
Module name per implementation
Also we need to know the module name for each implementation, which symfony does not really help us with.
We have to store the impl -> module map in a separate place, outside the actual event subscribers list.
Order lost when grouping by module
Currently in ModuleHandler::getHookListeners(), we are grouping implementations by module:
$this->invokeMap[$hook][$module][] = $callable;
This means that previous ordering by implementation is destroyed, if we have multiple implementations for one module.
A solution here could be to let ::getHookListeners() return an iterator, which would allow to repeat the same key.
yield $module_A => $module_A_early_implementations;
yield $module_B => $module_B_implementations;
yield $module_A => $module_A_late_implementations;
Of course the `$this->invokeMap` would need to have a different structure in that case. Perhaps it could contain iterator closures, or IteratorAggregate objects.
Ordering conflicts
With the system proposed here, there is always the potential for ordering conflicts.
E.g. A wants to run before B, but B also wants to run before A.
This type of conflict already existed (in theory) with hook_module_implements_alter(), and it almost never was a problem in practice, so I don't see it as a blocker.
And then we can have different implementations that want to run first, in that case the original module order decides.
Again, we already know this edge case problem from hook_module_implements_alter(). But explicit priorities or weights would avoid this fully.
Alternative
I want to recap the ordering logic of my older PR, from what I remember.
This was done without the event system, but in fact it would be very possible to integrate it with the event system, similar to what we do now, but without calculated priorities.
Basically, we have different levels of ordering, similar to semantic versions:
- Explicit numeric priority or weight. This would beat everything else.
- Altered order of modules. This would start with the default module order by module weight, and can be altered with hook_module_implements_alter(). This could either be applied for each weight/prio group, or only for priority = 0.
- Relative ordering relative to a module name. Instead of asking to run before or after a specific implementation, ask to run before or after a specific module's regular implementations, if the priority/weight is the same.
So it would look something like this:
// $impl_by_module order has been altered by hook_module_implements_alter().
foreach ($imp_by_prio as $prio => $impl_by_module) {
foreach ($impl_by_module as $module => $module_implementations) {
foreach ([...$before[$prio][$module], $module_implementations, ...$after[$prio][$module]] as $impl) {
yield $impl->module => $impl->callback;
}
}
}
The `$module_implementations` would include procedural _and_ OOP implementations, and a derivative of it is passed to hook_module_implements_alter().
This allowed to convert a procedural implementation with no impact on the ordering.
(The MR looked more complex due to micro optimization.)
This would also completely solve "combined" alter hooks, because we can just combine implementations with same prio and module.
I think the "artifacts" message is a distraction, main thing is to fix the test.
Possible follow-up: Completely remove PHP 8.1 support.
In the pipeline this is already the case (I think), but local tests could still run with 8.1, and websites could run on 8.1.
This is why I don't want to do it here, to keep the impact minimal.
See this for different versions of the message by php version:
https://3v4l.org/N7Mc4
For the real test fail:
DeprecationHelper::backwardsCompatibleCall(
\Drupal::VERSION,
'10.4.2',
fn() => '"foobarbaz" is not a valid backing value for enum Drupal\cas\Model\CasProtocolVersion',
fn() => '"foobarbaz" is not a valid backing value for enum "Drupal\cas\Model\CasProtocolVersion"',
),
Seems like we should distinguish by PHP version, not Drupal version.
donquixote → created an issue.
donquixote → created an issue.
Also wondering why login and logout are on the same service - they have quite different dependencies, e.g. loggout out doesn't need the entity storage.
I think I would prefer to see two distinct services.
We can add tests if we are happy with the direction.
The LogoutController->getServerLogoutUrl() should perhaps move to a service.
But this would add BC risks to this MR.
donquixote → created an issue.
donquixote → created an issue.
Let's see if anything wrong with this direction, then I can add tests.
Related but not same: #3196381: Forced login missing cache metadata → .
donquixote → created an issue.
A lot of standard entities, such as nodes and taxonomy terms, return string IDs.
These are stringified integers like '123' in php, a result of how we fetch the values from the database, even though in the database they are stored as proper integers. All of these would benefit from proper casting, at the cost of some possible BC hickups for code that relies on the stringified values.
I was talking about whether we have content entity types that allow non-numeric string ids like 'xyz'.
I guess contrib could create such types, in theory?
Btw for `->getOwnerId()`, it will return real 123 when the value was explicitly set earlier, but string '123' when it was loaded from the database.
For `->id()` I don't think we can explicitly set it, so it would always have the db value..
One idea to bring this to entity interfaces:
Add `@template TId of int|string` on EntityInterface, then use `@template-extends EntityInterface` or `@template-extends EntityInterface` on specific entity interfaces.
Not sure if there are content entity types that have string ids, or config entity types with int ids.
I would suggest to use some kind of feature detection instead of version detection.
This way we also support sites that apply a patch.
Not sure yet how that would work..
donquixote → created an issue.
Everything old is new again
Right, but now is as good a time as any to address it..
we might even backport to 10.x and/or 11.1.x pre OOP hooks.
Profiling PRs
I pushed two branches for profiling:
2865609-invokeAll-profile - adds profiling and does not include the change
2865609-inline-invokeAll-profile - adds profiling and includes the change
The profiling is for an invokeAll('theme') that actually never happens like this.
I only picked hook_theme() because it is the first that came to mind that has plenty implementations and can be executed without doing lots of other stuff first.
Arguably hook_theme() is a bad choice, because a lot of the time will be spent on the array merging, which we are not optimizing in this version of the MR. Still, if we can measure a difference here, then the experiment is not completely useless.
The way to run the profiling is to execute the kernel test.
The failure message reports different profiling metrics.
To compare, repeatedly switch branches and run the profiling kernel test.
Profiling results
For me, I get:
- Without the change, I get 'repeat.best' in the range of 54 to 56 microseconds.
- With the change, I get 'repeat.best' in the range of 52 to 55 microseconds.
There are outliers, but it is relatively consistent.
The difference between testHookTheme() and testHookThemeWithBogusArguments() is quite small, for me it was around 1 microsecond.
Conclusion
So, the improvement is very small, but it is reproducible.
I also see no reason to assume that it would cause slowness in other areas.
For complexity, I think the changed version is actually better, because you don't need to read two methods to see what is going on.
Some feedback.
I don't like attributes that are not related to the symbol they are attached to.
#[ReOrderHook] and #[RemoveHook] can be placed anywhere with the same effect, so perhaps there should be a different way to do this?
First version does not do anything about NestedArray::mergeDeep().
Imo we could postpone this to a separate issue, just to keep things incremental.
Replace isset($result) with NULL !== $result.Reviewing this, it seems that the benefits are negligible and can throw warnings where isset cannot.
The performance benefit would be negligible, true.
isset() treats undefined like null, it does not produce a warning if a variable or array offset does not exist.
Whenever we know for sure that a variable exists, we should use === NULL or !== NULL over isset(), it is more explicit and we learn quickly if a variable is misspelled, both from static analysis (IDE, phpstan) and from the php warning.
Actually, if we inline most of the stuff in invokeAll() we probably get some improvement.
We can do this in the other issue
📌
Optimize ModuleHandler::invokeAll()
Active
.
Perhaps the remaining benefit from separate methods per return value behavior would have diminishing returns after that other optimization.
And yes this part still applies with the new event hook system.
The most obvious first step here would be a dedicated logic for hooks with no return value and regular passing of parameters, so the first group in my previous comment.
I am not sure how much improvement this would bring, seems like a micro optimization.
If we only measure the raw overhead from the module handler, it will make a difference. If we measure total request time it will be overshadowed by other things.
Looking at ->invokeAll() calls, here are some examples for the most common types.
No return value:
hook_cache_flush()
hook_rebuild()
hook_(entity|ENTITY_TYPE)_(presave|insert|update|predelete|delete|revision_create)()
hook_entity_bundle_create()
hook_entity_bundle_delete()
Merge:
hook_updater_info()
hook_filetransfer_info()
hook_requirements()
hook_update_requirements()
hook_entity_preload()
hook_(entity|ENTITY_TYPE)_(access|create_access)()
hook_field_info_max_weight()
hook_entity_extra_field_info()
hook_entity_operation()
We could investigate if these all need the same merge logic.
I am looking at ->invokeAllWith() calls now..
For the inheritance problem:
I just found out that we can put attributes on variadic parameters :)
class B {
public function __construct(
protected readonly ServiceForB $serviceForB,
) {}
}
class C extends B {
public function __construct(
protected readonly ServiceForC $serviceForC,
#[AutowireParentArgs]
mixed ...$parent_args,
) {
parent::__construct(...$parent_args);
}
}
If we could create this attribute or find out that it already exists somewhere...
@deepali sardana
Please review the existing patches first, before you post your own.
Quick idea,
would it make sense to return a markup object with cacheable metadata, instead of a render element?
$markup = $renderer->renderRoot($element);
assert($markup instanceof MarkupInterface);
assert($markup instanceof CacheableDependencyInterface);
An alternative would be to put #[TrustedCallback] on the class itself. Which means we would have to change the #[Attribute] attribute on that attribute class.
But I think it is simpler and less "special" to just have the attribute on the __invoke() method.
@mdsohaib4242
Actually this does not do anything, because at the time you add your own '#process' callback, the '#process' array is still empty.
The would-be-replaced callbacks are added later by the form builder which is after hook_form_alter() based on $element['#type'].
@donquixote which MR? The newer one in draft?
Well, I am proposing to use the newer MR which is in draft.
Imo the old MR is a dead end, because it breaks existing behavior and makes it impossible to fully remove/replace the callbacks from #type.
But I don't want to remove the old one yet, unless everybody agrees on the new direction.
Having both means we can discuss which direction is preferable.
I notice that '#pre_render' does not allow object with __invoke() method, if it is not a closure.
See 📌 Support object with __invoke() for #pre_render. Active
(We can eventually handle #pre_render in a separate issue, but for now I find it interesting to include it here as proof-of-concept, to see the bigger picture of where this is going.)
donquixote → created an issue.
If multiple operations (e.g. form alter hooks) want to add '#process' callbacks, while preserving the element type default callbacks, they need to make sure that the element defaults are not executed more than once.
The following pattern works in most cases:
$element['#process'] ??= [new ElementTypeProcessCallback($this->elementInfoManager)];
$element['#process'][] = 'my_process_callback';
Can't Element::PROCESS_CALLBACK_PLACEHOLDER be a method that just does this:
Yes, this would also do the trick.
However, it does require a call to \Drupal::service(), if it is to be a constant.
Or alternatively, we could make this a class with __invoke() and DependencySerializationTrait. Because these things tend to get serialized. Actually not so bad!
Actually one more problem with this approach.
If multiple operations (e.g. form alter hooks) want to add '#process' callbacks, while preserving the element type default callbacks, they need to make sure that the element defaults are not executed more than once.
If the callback is a constant, we can easily check for in_array().
But if it is an object?
Actually, in_array(*, *, FALSE) applies weak object comparison, and this actually seems to work in this case.
But I am a bit afraid of weak object comparisons, I don't trust them.
Perhaps we want to move the ElementProcessCallback* classes to `Drupal\Core\Form\` namespace?
For now I had it in `Drupal\Core\Form\Render\Callback`, because form element type plugins are typically in the same namespace as render element type plugins.
I ma not changing it now because I want to see more feedback first.
We can't use readonly in combination with DependencySerializationTrait.
I am setting to "Needs review" for one round of feedback, but I am sure we will have to change some parts.
Here we go, some unit tests.
Let's discuss if this is a good direction, and how we can refine it.
I created a new MR with these placeholder objects following the idea by @joachim.
I want to add tests, but not sure yet where to put them.
call ELEMENTCLASS::getInfo()
Just to clarify, we would not call it like this, because it is not a static method.
Instead, we would call `$info = $this->elementInfoManager->getInfo($element['#type'])` as we already do in FormBuilder and Renderer.
I am working on something, just need to add tests..
@joachim
Can't Element::PROCESS_CALLBACK_PLACEHOLDER be a method that just does this:
1. look at #type in the render array it's working on
2. get the Element plugin class for that #type
3. call ELEMENTCLASS::getInfo()
4. get the callback names from the returned info array
5. call them
Yes, this would also do the trick.
However, it does require a call to \Drupal::service(), if it is to be a constant.
Or alternatively, we could make this a class with __invoke() and DependencySerializationTrait. Because these things tend to get serialized. Actually not so bad!
The bigger issue is that what you really want is a response containing a predictable interface.
So maybe a name of Drupal::serviceByInterface() would be better?
Indeed Drupal::serviceByInterface() would be a more accurate.
Basically you use the type that you would also use as a parameter type in a constructor that expects that service.
So it is the equivalent to autowire.
The reason I named it serviceByClass() in the MR is that:
- Some services don't have an interface. Especially in custom code or in early versions of a module this might be the case. If constructor parameter types use the class name, then any decorators also need to extend that.
- The term 'class' is used as a catch-all term in many places, where traits and interfaces are also included. E.g. "ReflectionClass" (php native), "class-string" (phpstan), "ClassLoader" (composer), etc.
In my practice, I definitely find myself wanting something like this but also my IDE knows wha to expect once I put in an assert() on the next line.
I tend to do either assert(* instanceof *) or an inline /** @var */ doc.
The problem with both of these is that they are more verbose, and they require a local variable, where otherwise I could use method chaining like Drupal::service(Abc::class)->foo();.
This has the benefit, also, of validating that I am getting the service I expect during tests and local development.
I use assert() in particular to check the order of services when I have multiple layers of decoration
Ok there are two different goals / cases here:
- In a test, I may want to assert the exact class of the service, to make sure my decorators work correctly. In that case I would use $this->assertSame(MyClass::class, get_class($service)); rather than native assert(* instanceof *) or PhpUnit ->assertInstanceOf().
- In my regular code, I would only ever assert($service instanceof SomethingInterface), because other modules might want to add further decorators. This verifies that the DI is not completely giving me the wrong service, but it does not verify that specific decorators are applied.
Changing issue title to be more findable, hopefully
It seems for all the callback types #process, #after_build, #pre_render and #post_render, the placeholder can be a simple identify function.
fn ($x) => $x
(but as regular named function or static method, not as closure)
Also, with this proposed change, it is no longer possible to fully replace or remove the callbacks from '#type'.
A solution could be to introduce a placeholder callback, so that we can:
$element = [
'#type' => 'radios',
'#process' => [
'my_custom_process_callback',
Element::PROCESS_CALLBACK_PLACEHOLDER,
'my_custom_late_process_callback',
],
];
and we get this:
$element = [
'#type' => 'radios',
'#process' => [
'my_custom_process_callback',
[Radios::class, 'processRadios'],
'my_custom_late_process_callback',
],
];
The placeholder should be a valid callback that, if not replaced, just returns the first parameter.
At the same time, it should be a constant that is easy to compare against or look up in an array.
This leaves some design choices to make:
- What exactly is the placeholder value? If it is a static method, how do we call the class and method?
- Where do we do the placeholder replacement? I was thinking of a new service ElementTypePopulator, which handles all of the '#type' replacement, so that we don't have to clutter up Renderer and FormBuilder.
(I did post some proof of concept code in a slack convo, but I am still too undecided about the class naming.)
I removed imports reordering in those files where no new imports are added.
I'd say this issue needs an explanation why the problem occurs with state, but not with keyvalue.
From the comments here I understand it is a race condition, and from looking at the code, I assume it is caused by the cache layer in state, because otherwise state is just a wrapper around keyvalue.
as suggested in the Slack thread
So, this should be in the issue summary..
I do find this issue from April 2024, which might be related:
🐛
[random test failures] Race condition in state when individual keys are set with an empty cache
Fixed
Some people were confused by the title..
Not sure what to make of the test failure.
Let's get one round of feedback first.