[PP-1] Async database query + fiber support

Issue created by @catch
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Comment almost 2 years ago →
🇬🇧United Kingdom catch
Comment over 1 year ago →
🇳🇱Netherlands kingdutch
I think with the learnings from 📌 [PP-2] Migrate BigPipe and Renderer fiber implementation to use Revolt Event Loop Postponed I'd propose a different approach:

All database queries should always be async. This is possible with Fibers (and the Revolt Event Loop) because the code that requires the result of the database query will not be executed until the result is returned. This ensures that while any database query is happening, unrelated code can run (e.g. a cache prewarmer or another bigpipe task). In case a piece of code needs to make multiple database queries and wants to do them at the same time then a primitive such as stream or concurrently can be used to control how they're executed. That would lead to similar patterns as implemented in the BigPipe changes.
Comment over 1 year ago →
🇬🇧United Kingdom catch
All database queries should always be async. This is possible with Fibers (and the Revolt Event Loop) because the code that requires the result of the database query will not be executed until the result is returned.

There are a couple of limitations which I think point against trying to do this:

1. PDO doesn't have any async support, we have an issue to add a mysqli driver to core, specifically to enable this issue. So we need a non-async code path for those drivers based on a capability check.

2.mysqli can only run one async query at a time per connection, it's not particularly well documented, but see for example https://stackoverflow.com/questions/12866366/php-mysqli-asynchronous-que... and https://dev.mysql.com/doc/refman/8.0/en/commands-out-of-sync.html

This means that to execute an async query, we need to open an additional MySQL connection each time, or re-use a connection from which a query has previously returned.

Especially with database caching, but also just in general, Drupal can easily execute dozens or hundreds of small database queries on each page request for authenticated users (or just on a cache miss). I think we would need/want to avoid a situation where we have say more than 5-10 connections open per request, since in a stampede we could end up hitting 'too many connections' limits. As soon as we have whatever limit of connections open (say 5) we'd be unable to execute any more until one returns, and we might not have anything else to do except issue more database queries. However, if we only execute async queries for longer database queries (entity queries and views etc.), then all the other ones can be handled by the main, non-async connection (whether that's using the mysqli driver or PDO) even if there are five, slow, async queries running.

The other reason is that some database queries (again, especially with the db cache but not only) are directly blocking the rest of the request, something like a key value query, path alias, or route lookup. In those cases, we want those (usually sub 1ms) queries to finish as soon as possible so we can move onto the next blocking thing, and wouldn't want to go back into the event loop to do prewarming or whatever in between.
Comment over 1 year ago →
🇳🇱Netherlands kingdutch
1. PDO doesn't have any async support, we have an issue to add a mysqli driver to core, specifically to enable this issue. So we need a non-async code path for those drivers based on a capability check.

I feel like this is missing the beauty of Fiber's in our mental model :D The whole point of Fibers is that we can have things that are async which look exactly the same as sync code: it solves the "What color is your function" problem (see "What problem do fibers solve?" in Christian Lück's great post.

So where we should end up (and if we don't we did something wrong) is that code calling Entity::loadMultiple($ids) shouldn't care how we load the data (whether that's sync or async) because thanks to Fibers the code writing it can be sure that even if the loading itself happens asynchronously (and other things are done in the meantime) the code in this specific place won't run past Entity::loadMultiple($ids); until those entities have loaded.

This means that to execute an async query, we need to open an additional MySQL connection each time, or re-use a connection from which a query has previously returned.

Especially with database caching, but also just in general, Drupal can easily execute dozens or hundreds of small database queries on each page request for authenticated users (or just on a cache miss). I think we would need/want to avoid a situation where we have say more than 5-10 connections open per request, since in a stampede we could end up hitting 'too many connections' limits. As soon as we have whatever limit of connections open (say 5) we'd be unable to execute any more until one returns, and we might not have anything else to do except issue more database queries. However, if we only execute async queries for longer database queries (entity queries and views etc.), then all the other ones can be handled by the main, non-async connection (whether that's using the mysqli driver or PDO) even if there are five, slow, async queries running.

The other reason is that some database queries (again, especially with the db cache but not only) are directly blocking the rest of the request, including in situations where most caches will be warm, something like a key value query, cache tag checksums, path alias, or route lookups. In those cases, we want those (usually sub 1ms) queries to finish as soon as possible so we can move onto the next blocking thing, and wouldn't want to go back into the event loop to do prewarming or whatever in between.

I think in the form of a ConnectionPool this is a problem that's long since been solved in other programming languages and we shouldn't re-invent the wheel here. I also don't think we should differentiate between the type of query (because the workload is inherently unpredictable) and all queries should go through the connection pool; especially since there's no query that shouldn't happen.

From a caller's perspective I don't even think we should know about the ConnectionPool, that should be the database driver's responsibility. Calling code just makes a database call and that'll suspend until data is available and will resume when data is present. Whether that suspension is because there's no connection free or whether the connection is waiting for data from a different connection shouldn't matter.

For scheduling what code actually gets to load data the EventLoop helps us with priorities based on how things are added to it: EventLoop::queue happens immediately when the eventloop gets controlled; EventLoop::defer happens first in the next tick; EventLoop::delay and EventLoop::repeat happen only after deferred callbacks have run. Responses to streams are also treated as higher priority as far as I understand.
Comment over 1 year ago →
🇬🇧United Kingdom catch
I feel like this is missing the beauty of Fiber's in our mental model :D The whole point of Fibers is that we can have things that are async which look exactly the same as sync code:

That's not the point though - if the query is synchronous, then we shouldn't be trying to suspend between executing it and it coming back, because there's no opportunity to do so.

I think in the form of a ConnectionPool this is a problem that's long since been solved in other programming languages and we shouldn't re-invent the wheel here.

Yes I mention a connction pool in the issue summary, but I don't think we should use that for cache queries (for example), there's literally no point to those being async and it's likely to make performance worse if we do that.
Comment about 2 months ago →
🇬🇧United Kingdom catch
📌 Create the database driver for MySQLi for async queries Active just landed so this is possible to implement now.

[PP-1] Async database query + fiber support

Problem/Motivation

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

Comments & Activities