Problem/Motivation
I have a site that's having trouble generating sitemaps. This may all be "known" but being as its breaking the module for me I thought I'd lay it out and document everything I found and maybe we can come up with a solution.
After spending some time in blackfire it looks like almost all the time is spent in this line eating 100s of megabytes of memory.
$link_url = Url::fromUri('internal:' . $link['loc'], $link_options + $url_options)->toString();
That looks like a simple enough piece of code. Just make the url into a string! Easy! Turns out internal urls are expensive.
in Url::fromInternalUri
we find this pretty reasonable but more worrying line:
$url = \Drupal::pathValidator()
->getUrlIfValidWithoutAccessCheck($uri_parts['path']) ?: static::fromUri('base:' . $uri_parts['path'], $options);
This method looks pretty normal too but then we see this weird bit of code:
$request = Request::create('/' . $path);
$attributes = $this->getPathAttributes($path, $request, $access_check);
Why do we need a request object? Well... getPathAttributes() is a beast.
Here's the meat of the method:
$path = $this->pathProcessor->processInbound('/' . $path, $request);
try {
$request_context = new RequestContext();
$request_context->fromRequest($request);
$router->setContext($request_context);
$result = $router->match($path);
}
So we basically do a full routing of the request. And it gets worse because as we match the request for entities we trigger a full load of that entity. 10s of thousands of queries pulling in all the fields on your crazy node type just to build a path. Ouch!
Proposed resolution
Looking through some history, this seems to be kinda a quick naive port of the d7 code then patched to support translations. I do wonder if we can't just generate urls based on routes instead where possible. I can't see a reason 'loc' couldn't just have the internal URIs. We'd need to provide BC and upgrade paths but internal:/custom/url
, route:/entity.node.canonical;node=123
would save a _lot_ of time generating URLs because route URI's are significantly cheaper to generate.
Remaining tasks
Decide on a path forward
User interface changes
n/a though it could possibly open some cool features to the UI in the future with the ability to more deeply integrate with url types.
API changes
Possibly different interaction with the 'loc' field.
Data model changes
New column or change to existing column in the database.
Release notes snippet