Investigate reports of performance drop in 11.1

Created on 27 July 2025, 8 days ago

Problem/Motivation

There was a report on reddit about a significant drop in page requests per second on 11.1
It seems to be specific to cached pages.
Initial investigation seems to be a combination of 📌 Replace ContainerAwareEventDispatcher with Symfony EventDispatcher Fixed and a conversion to OOP hooks 📌 OOP hooks using event dispatcher Needs review

The former seems to have some methods that do not scale to the number of events that Drupal now creates. The issue has a remaining task to do some profiling, but I see no evidence that this profiling was done.

There is this issue in progress to remove events from hooks for a separate reason 🐛 HookCollectorPass registers event listeners but they do not follow the required calling convention Active .

We should do some profiling now to find a long term solution.

Steps to reproduce

https://old.reddit.com/r/drupal/comments/1m8voml/requests_per_second_dro...

Proposed resolution

Use xhprof or https://github.com/fullfatthings/ddev-spx
Profile current 11.x
Profile 11.x with hook event dispatcher removed in 🐛 HookCollectorPass registers event listeners but they do not follow the required calling convention Active .
Profile 11.x with event dispatcher issue reverted 📌 Replace ContainerAwareEventDispatcher with Symfony EventDispatcher Fixed .
Profile 11.x with BOTH.

Remaining tasks

Find way to report profiles.

User interface changes

N/A

Introduced terminology

API changes

Data model changes

Release notes snippet

📌 Task
Status

Active

Version

11.0 🔥

Component

base system

Created by

🇺🇸United States nicxvan

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @nicxvan
  • 🇺🇸United States nicxvan
  • 🇺🇸United States nicxvan
  • 🇺🇸United States nicxvan

    Ok I got spx running, I think we might need to determine what exactly to benchmark and how to report it.

    For now I'm confirming some information from the reddit thread, for example resolveServicesAndParameters is called 1100 times on 11.x with the steps in the IS.

  • 🇺🇸United States nicxvan
  • 🇺🇸United States nicxvan

    Took a quick pass at reverting 📌 Replace ContainerAwareEventDispatcher with Symfony EventDispatcher Fixed to try benchmarking and it didn't work, I'll take another look unless someone gets to it before me.

  • 🇺🇸United States nicxvan
  • 🇬🇧United Kingdom catch

    It looks from the reddit thread that the resolveServicesAndParameters calls take about 2-3ms - the problem with this on page cache hits is that a page cache hit only takes 2-3ms at most, so it's doubling the request time.

    Also 11.1 only has core hooks converted, once you have a site with 2-300 modules, all of which have OOP hooks, that 2ms could turn into 5-10ms or more, which will be noticeable on dynamic page cache hits too (dynamic page cache hits are usually more like 60-70ms, so increasing those by 10-20%).

    One issue I'm seeing profiling is this is running for discovered procedural non-hooks still - uploading some dump output from having a look around. And because those go from collection to service closure, it looks like it results in two calls each.

  • 🇬🇧United Kingdom catch

    Here's a quick go at a revert, it also doesn't work. I get "Unable to find the controller for path "/". The route is wrongly configured." after a drush cr.

  • 🇬🇧United Kingdom catch

    To describe the basic issue:

    We add hooks to the event dispatcher via the 'calls' parameter to the event dispatcher, which calls ::addListener(), this happens every time event dispatcher is instantiated, e.g. on every single request.

    So even though we don't instantiate the individual hook services, we iterate over every hook implementation in core to create a service closure. This involves many recursive calls to ::resolveServicesAndParameters().

    ContainerAwareEventDispatcher on the other hand only creates the service definition lazily - e.g. when a specific event is dispatched.

    Even though the original reason we added ContainerAwareEventDispatcher is no longer an issue (e.g. services are not actually constructed for every listener), the sheer number of event listeners in Drupal core now after OOP hooks means that even creating the service closures on every request is causing a measurable performance regression.

  • 🇬🇧United Kingdom catch

    Here's a before/after xhprof screenshot with HEAD vs. 🐛 HookCollectorPass registers event listeners but they do not follow the required calling convention Active .

    Note that profiling differs whether you have dev dependencies installed or not, at least partly due to OpenTelemetry SDK, I think this was with dev dependencies installed, but the reduction is about the same even if the totals are different.

    So 🐛 HookCollectorPass registers event listeners but they do not follow the required calling convention Active is removing about 2/5ths of the function calls and about half the wall time.

    If 📌 Replace ContainerAwareEventDispatcher with Symfony EventDispatcher Fixed is going to be non-trivial to revert, then we could potentially profile 10.3.x just before/after that commit to see what the impact on cached pages is there. That would give us an idea whether it really is worth reverting or not.

    One thing I noticed during debugging that a lot of event listeners we have in core (that aren't hooks) are core event listeners for core events, we could also reduce that list by converting some core events to hooks 😇.

  • 🇺🇸United States nicxvan

    Thank you!

    A couple of quick notes in passing.

    I got past those errors locally but didn't save them cause it still felt wrong i had to clear cache a dozen times.

    I only tested 11.x and the remove events from hooks.
    11.1 will be far worse because the skip flag isn't set and preprocess is now picked up.

    11.1 might not be able to backport the event extraction for hooks.

  • Merge request !12849Combine calls to addListener(). → (Open) created by longwave
  • 🇺🇸United States nicxvan

    I've been discussing this in slack with @ghostofdrupalpast since he alerted me too the issue.

    #1972300-37: Write a more scalable dispatcher I think shows the dispatcher will be problematic on any sites with many events.

    I think this is more evidence that the underlying issue is the dispatcher change linked in the issue summary.

  • 🇬🇧United Kingdom longwave UK

    This MR combines the calls to addListener() which removes ~670 method calls locally, but it doesn't really make that much difference.

    There is nothing stopping us having multiple event dispatchers; I am wondering if we should put hooks on their own, separate dispatcher.

  • Pipeline finished with Failed
    8 days ago
    #557966
  • 🇬🇧United Kingdom longwave UK

    MR!12850 is promising locally. Testing on the Umami homepage, warm cache:

    11.x:

    Wall Time 11,910 µs
    CPU Time 9,726 µs
    Memory Usage 3,249,088 bytes
    Peak Memory Usage 4,462,840 bytes

    670 calls to EventDispatcher::addListener()

    MR!12850:

    Wall Time 8,083 µs
    CPU Time 6,179 µs
    Memory Usage 2,082,648 bytes
    Peak Memory Usage 2,481,720 bytes

    151 calls to EventDispatcher::addListener()

  • Pipeline finished with Success
    8 days ago
    Total: 652s
    #557970
  • 🇺🇸United States nicxvan

    Separate hooks from events was more about the signature being problematic.

    However, I discussed that issue with someone last week who wants to remain anonymous that convinced me that the signature mismatch is not problem from a practical standpoint.

    But for performance reasons shown here it is crucial.

    The problem with that and the MRs here is they are both just treating the symptom not the cause.

    Symfonies event dispatcher just doesn't scale to hundreds of events.

    ContainerAwareEventDispatcher on the other hand scales to thousands.

    The comment I linked in 15 shows that the changes here or separating hooks from events only kicks the can down the road a bit.

    Larger and more complex sites will still hit a wall.

    The issue removing it clearly calls or the performance concerns which were not tested.

    I think the solution is a revert.

  • 🇩🇪Germany donquixote

    The page that you are testing, are any hooks called on a warm cache?
    And is ModuleHandler instantiated at all in these requests?

    If ::addListener() calls are what slows down these requests, maybe we could go even further and have a separate event dispatcher with only those events that are commonly invoked in requests with warm cache?

  • 🇬🇧United Kingdom longwave UK

    It's the Umami homepage as anonymous, so on a warm cache no hooks should be invoked, hence the drop in calls to addListener. As I understand it from the Reddit thread and the issue summary it's cached pages that are the problem here?

    As soon as a hook is invoked I think this will make no difference, because instantiating the hook event dispatcher goes back to being expensive.

  • 🇬🇧United Kingdom longwave UK
  • 🇬🇧United Kingdom longwave UK

    longwave changed the visibility of the branch 3538212-combine-calls to hidden.

  • 🇬🇧United Kingdom longwave UK

    longwave changed the visibility of the branch 3538212-investigate-reports-of to hidden.

  • 🇺🇸United States nicxvan

    Yeah that's why I think switching this in practice doesn't help us.

    Either we revert to container aware or we pull hooks out (or both).

    If we don't revert we should go upstream and try to get a fix.

  • 🇬🇧United Kingdom longwave UK

    The problem is when can we make these changes. The bigger the change the more likely we have to wait for 11.3 but given this is a regression maybe we want a stopgap fix in 11.2?

  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    ghost of drupal past made their first commit to this issue’s fork.

  • Pipeline finished with Failed
    8 days ago
    Total: 252s
    #558101
  • Pipeline finished with Failed
    8 days ago
    Total: 192s
    #558102
  • Pipeline finished with Failed
    8 days ago
    #558104
  • 🇬🇧United Kingdom catch

    I'm not too worried about an 11.2 backport, this is a regression for cached pages, but it's a regression from very fast to still quite fast. It'll be barely measurable for requests that don't hit the page cache (partly why we haven't noticed it until now) and it's all CPU rather than i/o. If we end up with something backportable that's great but more interested in the longer term solutions.

    Either we revert to container aware or we pull hooks out of event dispatcher (or both).

    So I think if we revert to container aware event dispatcher, then we should have almost zero overhead from having hooks in event dispatcher, but we could still move them out for other reasons.

    If we move hooks out first, then the regression from not using container aware will be whatever it was in 10.3 when the initial issue to not use it was committed, which seems relatively small but also worth fixing. While I might have predicted the future I also wasn't thinking about cached pages in that comment.

    The problem with that issue and the MRs here is they are both just treating the symptom not the cause.

    This is true but also I think if we know it doesn't scale to thousands of events and we already have a plan to stop putting thousands of events in it, we should definitely do that.

  • Pipeline finished with Failed
    7 days ago
    Total: 195s
    #558145
  • Pipeline finished with Success
    7 days ago
    Total: 1710s
    #558152
  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    There. I reintroduced the container aware dispatcher but with much better tie-in via a new ContainerAwareEventDispatcherDefinition which YamlFileLoader uses. In the original version a new pass was needed instead of Symfony's. No longer. The very definition of the service captures the addListener calls from the Symfony pass and changes them into the argument structure required by ContainerAwareEventDispatcher. This allows keeping all the functionality of RegisterListenersPass including multiple event dispatchers with practically zero overhead. Tests could be resurrected of course and/or tests could be written for ContainerAwareEventDispatcherDefinition but since all of core flows through it, I haven't felt the need.

  • 🇺🇸United States nicxvan

    I will try to do some performance testing against both branches.

    I'll try to find some more intense pages to test too so it's hitting hooks.

    I'll be honest, I think while @longwave's approach is much simpler I think it only helps in one very specific case while @ghostofdrupalpast's seems to be attacking the root cause.

    It still needs validation of course.

    Also while it's more complex, it is more likely able to be cleanly backported whereas putting hooks in their own event dispatcher can only possibly help 11.1+.

  • 🇺🇸United States nicxvan
  • 🇺🇸United States nicxvan

    I didn't mean to delete that file or change status, it was an old tab.

  • 🇺🇸United States nicxvan
  • 🇺🇸United States nicxvan

    I also am adding a permissions page view logged in. I cleared cache then refreshed twice for each.

    Still kind of preliminary, but I think there is a leader.

  • 🇬🇧United Kingdom longwave UK

    If we inject the container and read from it at runtime I think this means we will never be able to make event listener services private by default. I am not sure if this is important.

  • Pipeline finished with Success
    7 days ago
    Total: 512s
    #558457
  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    Note the changes do not invalidate the performance work, they are cosmetic / build time. But now the code doesn't look like something the cat barfed up. That ladder of ifs was hard to comprehend, this is linear.

  • Pipeline finished with Success
    7 days ago
    Total: 560s
    #558472
  • 🇨🇭Switzerland znerol

    Event dispatcher shouldn't be instantiated for cached pages at all. Filed: 🐛 Regression: All stack middlewares are constructed at the same time even for cached pages Active .

  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    That's even better but a performance regression is present since 10.3 for any site with a large number of listeners regardless of cached pages. It's possible this issue needs to be retitled for that and the issue znerol linked is the urgent page cache performance regression which is independent after all.

  • 🇨🇭Switzerland znerol

    a performance regression is present since 10.3 for any site with a large number of listeners regardless of cached pages.

    The performance regression was reported for 11.1 and is clearly measurable on 11.1 on cached responses. Is there any evidence that 10.3 suffered a performance regression on cached pages of the same scale?

  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    Not on cached pages. The performance regression exists on every page if you have enough listeners.

  • 🇨🇭Switzerland znerol

    Not on cached pages. The performance regression exists on every page if you have enough listeners.

    Then please submit your PR as a separate issue.

  • 🇺🇸United States nicxvan

    Then please submit your PR as a separate issue.

    I'm not sure what you mean by this, the root cause was the change in 10.3 linked in the issue summary but was only exposed in 11.1 due to the oop conversion adding so many listeners.

    Whether this gets backported is a separate discussion, but ghost already submitted a branch that I did some preliminary testing on.

  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    @znerol yes, it's possible this issue should be retitled to "revert the event dispatcher" and the issue title from here needs to move to yours.

  • Pipeline finished with Canceled
    7 days ago
    Total: 131s
    #558675
  • Pipeline finished with Canceled
    7 days ago
    Total: 193s
    #558679
  • Pipeline finished with Success
    7 days ago
    Total: 667s
    #558680
  • Pipeline finished with Failed
    7 days ago
    Total: 403s
    #558711
  • The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • 🇬🇧United Kingdom catch

    Trying a re-title.

  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    Based on the performance benchmark in the issue summary there is a 10% hit on the admin performance page from the Symfony event dispatcher just with standard without hooks. So I am not sure whether "hundreds" are even needed.

  • Pipeline finished with Failed
    7 days ago
    Total: 294s
    #558966
  • Pipeline finished with Failed
    7 days ago
    Total: 140s
    #558990
  • Pipeline finished with Failed
    7 days ago
    Total: 116s
    #558993
  • Pipeline finished with Failed
    7 days ago
    Total: 203s
    #559004
  • Pipeline finished with Failed
    6 days ago
    Total: 300s
    #559142
  • Pipeline finished with Success
    6 days ago
    Total: 675s
    #559161
  • 🇺🇸United States nicxvan

    This is at least major i think.

  • 🇨🇦Canada Charlie ChX Negyesi 🍁Canada

    ghost of drupal past changed the visibility of the branch 3538212-separate-dispatcher to hidden.

  • Pipeline finished with Failed
    6 days ago
    #559656
  • A few small comments, mostly about typing.

  • Pipeline finished with Failed
    6 days ago
    Total: 126s
    #559813
  • Pipeline finished with Canceled
    6 days ago
    Total: 68s
    #559816
  • Pipeline finished with Failed
    6 days ago
    Total: 310s
    #559817
  • The Needs Review Queue Bot tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide to find step-by-step guides for working with issues.

  • Pipeline finished with Failed
    6 days ago
    Total: 124s
    #559838
  • Pipeline finished with Failed
    6 days ago
    Total: 4516s
    #559842
  • Pipeline finished with Failed
    6 days ago
    Total: 281s
    #559908
  • Pipeline finished with Failed
    6 days ago
    Total: 167s
    #559951
  • Pipeline finished with Success
    6 days ago
    Total: 918s
    #559955
  • 🇺🇸United States nicxvan
  • Pipeline finished with Success
    6 days ago
    #559971
  • Pipeline finished with Failed
    6 days ago
    Total: 142s
    #559996
  • Pipeline finished with Failed
    5 days ago
    Total: 669s
    #560002
  • Pipeline finished with Failed
    5 days ago
    Total: 604s
    #560057
  • Pipeline finished with Success
    5 days ago
    #560069
Production build 0.71.5 2024