Try to replace path alias preloading with lazy generation

Created on 28 December 2024, 4 months ago

Problem/Motivation

Inspired by @berdir in ๐Ÿ› Feature or Bug? - The path-alias preloading is not used if the "Enforce clean and canonical URLs" option is enabled Active .

Depends on ๐ŸŒฑ Adopt the Revolt event loop for async task orchestration Active .

We currently have a path preload cache, added by me and others in about 2008. It records all the system paths found on a page when the cache is cold, writes them to a per-URL cache object, then when that page is requested again, if a path alias is requested, uses that cache item to preload all the path aliases.

It saves a lot of queries but is less effective in 11.x than it was in 7.x because we now have render caching and dynamic_page_cache, meaning the amount of times we're generating the same path aliases on the same page will be lower (if still possibly more than once every 24 hours). Also it takes up quite a lot of cache space.

In ๐ŸŒฑ Adopt the Revolt event loop for async task orchestration Active @berdir suggested looking at potentially removing it - this would be fine on sites with good render cache hit rates, but sites with lots of content changes (comment posting on the same URL, node list cache tag constantly invalidated) might still do a fair bit of extra queries.

I think we might potentially have a way to do at least some of the multiple loading, without the caching - if so it would be best of both worlds.

If we look at an xhprof run on a cold cache page request, we can see that all/most of the alias lookups are from UrlGenerator::generateFromRoute()

UrlGenerator::generateFromRoute() builds the URL in two steps.

First it generates the 'system path' from the route. Then it calls PathProcessor::processOutbound() to apply outbound path processors which does amongst other things handling path aliases.

This is all very much per-URL-object and does not provide an obvious opportunity for path alias multiple loading - and while it's changed a fair bit, it's not dramatically different from the original problem we had in 2008.

However, what if we could register system paths onto the path processor - e.g. populate PathAliasManager::preloadedPathLookups as we generate system paths from routes.

Using revolt/Fibers, we could then suspend before running outbound path processors, and allow other code to execute, which potentially will generate more paths, and add them to preloadedPathLookups too. Then when we resume in UrlGenerator::generateFromRoute(), we'd end up calling PathAliasManager::getAliasByPath and that can then multiple load any path aliases that aren't static cached and have been added to preLoadedPathLookups().

Because of the way that e.g. ๐Ÿ“Œ Add PHP Fibers support to BigPipe RTBC was implemented, there is a non-zero chance that we'll be able to multiple load all path aliases that are rendered at the top level of block render arrays and similar.

No idea if this will actually work, but if it does:

1. It will work on completely cold caches, saving potentially dozens of database queries.

2. It won't require maintaining a separate per-page cache item.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

๐Ÿ“Œ Task
Status

Active

Version

11.0 ๐Ÿ”ฅ

Component

routing system

Created by

๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

Live updates comments and jobs are added and updated live.
  • Performance

    It affects performance. It is often combined with the Needs profiling tag.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Instead of doing this in the url generator we can probably do it in AliasManager::getAliasByPath() instead.

    Add the system path to a class property ready for preloading, Fiber::suspend(), when we get resumed, look for any paths to multiple/preload that haven't already been, multiple load, continue.

    The main question is how many times can we ::suspend() before we resume, but we only need to pick up two path aliases each time to reduce the number of queries by 50% so the odds might be decent. The existing solution has built in staleness where subsequent page requests will individually lookup paths that weren't found when the preload cache was set.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Put up a proof of concept here but it doesn't do anything useful on standard yet - almost nothing is rendered in a placeholder and that's the only inside-fiber rendering we do.

    So this will need ๐Ÿ“Œ Create placeholders for more things Active - at least a rebase of the MR there and some combined testing, in order to see whether it can be viable.

    I did add some views blocks (recent comments, recent content) to the footer along with adding some alias for things like comment/reply, and saw this working - so in theory it has potential just not in practice yet.

  • Merge request !10724Draft: Initial proof of concept. โ†’ (Open) created by catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Another attempt at a better issue title.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    OK this actually works.

    Manual testing with https://git.drupalcode.org/issue/drupal-3496369/-/tree/3496369-with-rend...

    With a completely cold cache on an Umami recipe page, there are 38 calls to Drupal\path_alias\AliasRepository::lookupBySystemPath + the cache get and set for the preload cache.

    With the MR, there are 31 calls to Drupal\path_alias\AliasRepository::preloadPathAlias and no cache get/set.

    With a warm render cache, there will be zero path alias gets at all because the resultant URLs are in the render cache anyway.

    So the worst case scenario is improved, and the best case scenario is the same.

    When the path alias cache is warm but the render cache is not (multiple roles viewing the same path for example), then there will probably be more database queries - but what's in the cache will depend on which role visits the page first, so it's not like preloading gets the queries down to one, there's always some stragglers.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Then I added ๐Ÿ“Œ Create placeholders for more things Active locally and got down to 14 calls to preloadPathAlias() - this is nearly a 2/3rds reduction in path lookup queries on a cold cache. Because the cache is per-URL it will be a similar reduction across all the first visits to a URL on a site after a cache clear or each time the cache expires if the render cache for that page isn't warm."

  • Pipeline finished with Failed
    4 months ago
    Total: 146s
    #385377
  • Pipeline finished with Failed
    4 months ago
    Total: 118s
    #385380
  • Pipeline finished with Failed
    4 months ago
    Total: 106s
    #385390
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • Pipeline finished with Failed
    4 months ago
    Total: 572s
    #385404
  • Pipeline finished with Failed
    4 months ago
    Total: 104s
    #385418
  • Pipeline finished with Failed
    4 months ago
    Total: 454s
    #385422
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    This issue + ๐Ÿ“Œ Create placeholders for more things Active seems to be enough without ๐Ÿ“Œ Render children in fibers Active , which is good because changing the renderer like that is a bit alarming.

    However when testing I ran into the exception that ๐Ÿ“Œ Deprecate RendererInterface::render()'s sole $is_root_call parameter Needs work is trying to remove, so we will need that issue too to make this all work together.

    The actual MR here still has a couple of test failures in path alias tests, but it'll be reviewable by itself without the other two issues - just won't be a performance improvement (or a very marginal one) until those also land.

    Not exactly sure what test coverage looks like for this - we can probably set up some Fibers in a kernel test that call getAliasByPath() and check that the aliases are still returned correctly.

    ๐Ÿ“Œ Add assertions to OpenTelemetryNodePagePerformanceTest::testNodePageCoolCache() Active would hopefully cover performance testing once it's in, alongside the other two issues.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    catch โ†’ changed the visibility of the branch 3496369-with-renderer-support to hidden.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    However when testing I ran into the exception that #2511330: Deprecate RendererInterface::render()'s sole $is_root_call parameter is trying to remove, so we will need that issue too to make this all work together.

    Not entirely proven yet, but I think this might be ::executeInRenderContext() which relies on a (static) stack to track cacheability metadata from code that returns strings, some discussion in ๐Ÿ“Œ Refactor the render context stack to avoid a static property Active .

    I think we might need to do something like use a fiber in executeInRenderContext() so that we can 'trap' any fibers underneath - e.g. if they suspend, resume them again, so that any metadata doesn't leak into the stack.

    Explicitly postponing this on ๐Ÿ“Œ Create placeholders for more things Active .

  • Status changed to Needs work about 1 month ago
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • Pipeline finished with Failed
    24 days ago
    Total: 195s
    #461799
  • Pipeline finished with Failed
    24 days ago
    Total: 177s
    #461803
  • First commit to issue fork.
  • Pipeline finished with Failed
    20 days ago
    Total: 524s
    #466172
  • ๐Ÿ‡จ๐Ÿ‡ญSwitzerland berdir Switzerland

    Worked a bit on this to clean up the unit tests so we can see the other tests now that they are blocking. I adjusted some and remove several that were testing the interaction with the cached preload routes.

    Two performance tests then failed, one with the is root exception, the other changed how the query looked but otherwise pretty minimal impact. It did drop one query though, not exactly sure what happened to that.

    Missed that we we need [#3476421: Add assertions to OpenTelemetryNodePagePerformanceTest::testNodePageCoolCache()] for the scenario where this matters.

    Note that ๐Ÿ“Œ Add database and cache assertions to OpenTelemetryFrontPagePerformanceTest and OpenTelemetryNodePagePerformanceTest Active is kind of a duplicate of that one, should clean up those issues a bit to avoid duplicates.

  • Pipeline finished with Failed
    20 days ago
    Total: 657s
    #466182
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    The is root exception is going to be a hard blocker here (and potentially for async queries too since I think it will hit the same thing).

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Pretty sure this (and potentially any other fiber suspension within placeholder rendering) is blocked on ๐Ÿ“Œ Deprecate RendererInterface::render()'s sole $is_root_call parameter Needs work .

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    If you combine the changes here with ๐Ÿ“Œ Deprecate RendererInterface::render()'s sole $is_root_call parameter Needs work you get a new error:

    AssertionError: Bubbling failed. in assert() (line 634 of /var/www/html/core/lib/Drupal/Core/Render/Renderer.php) #0                    
                                             /var/www/html/core/lib/Drupal/Core/Render/Renderer.php(634):
    

    But that gave me an idea on how to workaround it, pushing a new branch to see if it helps with the test failures.

  • Merge request !11784Draft: Preload and root โ†’ (Open) created by catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    MR up on ๐Ÿ“Œ Pass RenderContext around in the Renderer Active now. That makes this PP-2, but pretty sure those two issues are the only hard blockers and we mostly need to adjust test expectations here once they're in. Per #18 ๐Ÿ“Œ Add database and cache assertions to OpenTelemetryFrontPagePerformanceTest and OpenTelemetryNodePagePerformanceTest Active is necessary to see the improvement, although the existence of the test means we can run a combined MR to see the difference even if it's not actually committed yet.

  • Pipeline finished with Failed
    15 days ago
    Total: 460s
    #470375
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    With ๐Ÿ“Œ Add database and cache assertions to OpenTelemetryFrontPagePerformanceTest and OpenTelemetryNodePagePerformanceTest Active there is hardly any visible improvement or regression.

    However manually testing with umami + the navigation module enabled, and then truncating cache_render, I do see a couple of path aliases loaded together (but several loaded individually).

    If we were to render individual entities in views listings in placeholders (which we theoretically could), I think we'd see a significant saving though.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Removing 'try to' from the issue title because this definitely works.

    Right now, with the three issues, it works, but only a couple of path aliases get multiple loaded.

    To make it more effective, the following two issues would help a lot:

    ๐Ÿ“Œ Use a lazy builder/placeholder for the top bar Active
    ๐Ÿ“Œ Views entity rendering should use a lazy builder/placeholder Active

    The reason those should make a big difference is because the way this works is by picking up the first link it finds when rendering a placeholder, suspending the Fiber, and then doing that for each available placeholder. When it reaches the first placeholder again (e.g. the Fiber is resumed), then it will preload any path aliases picked up from the first iteration through the placeholders and load them at once. ๐Ÿ“Œ Create placeholders for more things Active did similar for several blocks already, but a lot of blocks don't necessarily have unique path aliases to look up.

    So if you have five placeholders (at the same level of nesting), and each placeholder has one alias to load, you get five aliases loaded in one loookup.

    But if you have a single placeholder with 50 URLs, then it will only multiple load the first one or two, then the rest individually, because it only has one placeholder with that many links in it.

    For e.g. Umami views showing lists of recipes or taxonomy terms, if each of those is a placeholder, then we're looking at 10, 20, 50 path alias lookups at once instead of individually, and crucially, this will work on both render cache hits and misses. Also allows us to remove a high-cardinality, low-hit rate cache item at the same time.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    Cross-linking the original issue that added the system lookup cache for multiple loading of path aliases back in 2009 #456824: drupal_lookup_path() speedup - cache system paths per page. โ†’ . The code has moved around a lot but the logic hasn't really changed, pretty nice to be in a position to remove it with something better, umm, 16 years later.

  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch

    In combination with the two blocking issues, and ๐Ÿ“Œ Views entity rendering should use a lazy builder/placeholder Active , and also ๐Ÿ“Œ Add database and cache assertions to OpenTelemetryFrontPagePerformanceTest and OpenTelemetryNodePagePerformanceTest Active , it's possible to see a real reduction in queries on cold caches now - about 15 less on the Umami front page and node pages.

        Open Telemetry Front Page Performance (Drupal\Tests\demo_umami\FunctionalJavascript\OpenTelemetryFrontPagePerformance)
         โœ˜ Front page performance
    โ”œ Failed asserting that two arrays are identical.
           โ”Š ---ยทExpected
           โ”Š +++ยทActual
           โ”Š @@ @@
           โ”Š  Array &0 [
           โ”Š -ยทยทยทยท'QueryCount'ยท=>ยท376,
           โ”Š +ยทยทยทยท'QueryCount'ยท=>ยท361,
    
     Open Telemetry Node Page Performance (Drupal\Tests\demo_umami\FunctionalJavascript\OpenTelemetryNodePagePerformance)
         โœ˜ Node page
         โ”
           โ”œ Failed asserting that two arrays are identical.
           โ”Š ---ยทExpected
           โ”Š +++ยทActual
           โ”Š @@ @@
           โ”Š  Array &0 [
           โ”Š -ยทยทยทยท'QueryCount'ยท=>ยท458,
           โ”Š -ยทยทยทยท'CacheSetCount'ยท=>ยท441,
           โ”Š +ยทยทยทยท'QueryCount'ยท=>ยท443,
           โ”Š +ยทยทยทยท'CacheSetCount'ยท=>ยท440,
    
    
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
  • ๐Ÿ‡ฌ๐Ÿ‡งUnited Kingdom catch
Production build 0.71.5 2024