Add an allowlist for incoming path aliases

Created on 23 February 2024, 4 months ago
Updated 26 February 2024, 4 months ago

Problem/Motivation

Discovered via 📌 Log every individual query in performance tests (Needs work), where we're observing URL alias lookups for sites/default/files/css/long-filename, which is never going to be aliased, and for rss.xml, which could be aliased but isn't on most sites.

On every HTTP request to Drupal, we check whether the incoming URL is a path alias. This is an uncached (and mostly uncacheable, even from the query cache) database query on every page.

For alias lookups, i.e. outgoing path aliases, we have the PathAliasWhitelist cache collector. This stores a list of 'first path parts' (top level directories) like 'node', 'user', 'admin' and whether there are any aliases under each of them. So if there is one alias for node/1, we'll look up aliases for anything under node/*, but if there are no aliases for user/*, we don't look up any aliases for those paths. Note that this class shouldn't be called 'PathAliasWhitelist'; it should be PathWhitelist or probably PathAllowlist.
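As a rough illustration of the 'first path parts' idea described above, here is a minimal sketch. It is written in Python for brevity, not Drupal's PHP, and it is not the actual cache collector: the class name OutgoingAliasWhitelist, the in-memory dict standing in for the alias table, and the lookups counter are all hypothetical.

```python
from typing import Dict


def first_path_part(path: str) -> str:
    """Return the top-level segment of a path, e.g. '/node/1' -> 'node'."""
    return path.strip("/").split("/", 1)[0]


class OutgoingAliasWhitelist:
    """Hypothetical stand-in for the outgoing whitelist; not Drupal's API."""

    def __init__(self, aliases: Dict[str, str]) -> None:
        # Maps system path -> alias; stands in for the alias storage table.
        self._aliases = aliases
        # Lazily filled map: first path part -> "are there any aliases under it?"
        self._has_aliases: Dict[str, bool] = {}
        # Counts simulated per-path alias queries.
        self.lookups = 0

    def _root_has_aliases(self, root: str) -> bool:
        if root not in self._has_aliases:
            self._has_aliases[root] = any(
                first_path_part(path) == root for path in self._aliases
            )
        return self._has_aliases[root]

    def get_alias(self, path: str) -> str:
        # Skip the per-path query when nothing under this root is aliased.
        if not self._root_has_aliases(first_path_part(path)):
            return path
        self.lookups += 1
        return self._aliases.get(path, path)
```

With a single alias for /node/1, every path under node/* still costs a lookup, while paths under user/* are returned without one.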

We could apply the same pattern to incoming aliases too.

For example, if there is no alias that starts with admin/*, then we don't need to look up an alias for admin/config, admin/structure etc.; we could just skip the lookup altogether. The same applies to user paths, which often aren't aliased.

Core also looks up aliases for rss.xml on the standard profile front page, because that link is generated with internal:rss.xml which triggers incoming path processing.

Steps to reproduce

Proposed resolution

Consider adding an AliasAllowList service which dynamically builds a list of 'first path parts' for aliases. When an incoming path is requested, check its first path part against the list, and skip the alias lookup if it's not in there.
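A minimal sketch of how such a service might behave, in Python rather than Drupal's PHP. Only the name AliasAllowList comes from this issue; the methods, the in-memory dict standing in for the alias table, and the queries counter are assumptions made for illustration.

```python
from typing import Dict, Optional, Set


class AliasAllowList:
    """Hypothetical sketch of the proposed service; not an existing Drupal API."""

    def __init__(self, alias_to_path: Dict[str, str]) -> None:
        # Maps alias -> system path; stands in for the alias storage table.
        self._alias_to_path = alias_to_path
        # Set of first path parts of all aliases, built on first use.
        self._first_parts: Optional[Set[str]] = None
        # Counts simulated per-request alias queries.
        self.queries = 0

    @staticmethod
    def _first_part(path: str) -> str:
        return path.strip("/").split("/", 1)[0]

    def may_be_alias(self, incoming: str) -> bool:
        if self._first_parts is None:
            self._first_parts = {
                self._first_part(alias) for alias in self._alias_to_path
            }
        return self._first_part(incoming) in self._first_parts

    def resolve(self, incoming: str) -> str:
        # Skip the query entirely when no alias starts with this path part.
        if not self.may_be_alias(incoming):
            return incoming
        self.queries += 1
        return self._alias_to_path.get(incoming, incoming)
```

In this sketch, requests for admin/config, rss.xml or sites/default/files/css/long-filename never reach the simulated query unless some alias shares their first path part. A real implementation would also have to rebuild or invalidate the list when aliases are added or deleted, which is not shown here.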

Remaining tasks

There is a possible way this can go wrong.

Most sites namespace most of their path aliases - i.e. /article/the-article-title, or /2012/the-article-title or something like that, with maybe a few custom one-off aliases for landing pages like /about. This will work fine and result in a small array.

But it's possible to configure aliases like /the-article-title /something-i-wrote /another-random-title with pathauto, which could lead to thousands of 'first path parts'. This might require some kind of limit, set as a parameter in the service container, of 500 or 1000 or something, just to stop the list getting completely out of hand if a site does this.
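One way such a limit could work, sketched in Python under stated assumptions: the function names and the max_parts parameter are hypothetical, and falling back to "always look up" once the limit is exceeded is only one possible policy, not something this issue has decided.

```python
from typing import FrozenSet, Iterable, Optional


def build_capped_allow_list(
    aliases: Iterable[str], max_parts: int = 500
) -> Optional[FrozenSet[str]]:
    """Collect first path parts, giving up once there are more than max_parts."""
    parts = set()
    for alias in aliases:
        parts.add(alias.strip("/").split("/", 1)[0])
        if len(parts) > max_parts:
            # Too many distinct first parts: signal that the list is unusable.
            return None
    return frozenset(parts)


def should_lookup(path: str, allow_list: Optional[FrozenSet[str]]) -> bool:
    """Decide whether the alias query is needed for an incoming path."""
    if allow_list is None:
        # The list was abandoned, so fall back to always querying.
        return True
    return path.strip("/").split("/", 1)[0] in allow_list
```

A site with flat aliases then behaves exactly as it does today, while namespaced sites keep the benefit of the small list.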

User interface changes

API changes

Data model changes

Release notes snippet

📌 Task
Status

Active

Version

11.0 🔥

Component
Path  →

Last updated 6 days ago

  • Maintained by
  • 🇬🇧United Kingdom @catch
Created by

🇬🇧United Kingdom catch

  • Performance

    It affects performance. It is often combined with the Needs profiling tag.
