[PP-1] Async database query + fiber support

Created on 1 September 2023, over 1 year ago
Updated 14 March 2024, 10 months ago

Problem/Motivation

With πŸ“Œ Create the database driver for MySQLi for async queries Active we'll have a database driver that is able to execute asynchronous queries.

I think we need potentially two APIs to handle that.

1. An interface indicating support for async queries, something like:

AsyncInterface{

function executeAsync();
}

As well as something to get the query when it comes back.

Ideally we can encapsulate at least MySQL and PostgreSQL's implementations so they'll work through one API. Core is only likely to need to execute one async query at a time rather than sending multiple, although you never know.

2. A way to conditionally execute a query async when in a Fiber and the driver supports it, this doesn't necessarily need to live in the database API as such.

Could look something like this:


class AsyncQueryHelper {

  public function getAsyncConnectionIfInFiber($database, $target) {
    // Given a database and target, get a dedicated connection for executing
    // an asynchronous database query. This can maintain a stack so that existing
    // connections where the query has already returned can be re-used. Avoids
    // "commands out of sync; you can't run this command now".
    if (\Fiber::getCurrent() === NULL) {
      return Database::getConnection($database, $target);
    }
     return $this->getAsyncConnectionFromStack($database, $target);
  }

  public function executeAsyncSuspendFiber($query) {

    // If the driver doesn't support async quer
    if (!$query instanceof AsyncInterface) {
      return $query->execute();
    }
    $fiber = \Fiber::getCurrent();
    // If we're not within a fiber, we just have to return when the query comes back
   
    if ($fiber === NULL) {
      throw new \Exception('If you're not in a fiber and you have an async query interface, you didn't use getAsyncConnectionIfInFiber so shouldn't be using this API);
    }
    
    // Now we're cookin'.
    $query->execute();
    $fiber->suspend();
    // @todo: infinite loop protection.
    // The query holds a reference to the connection so can we poll the result from that?
    while (!$query->hasResult()) {
      $fiber->suspend();
    }
    return $query->getResult();
  }
  
} 

PHP's MySQL implementation, even with mysqli async support, can only handle one query at a time. If you execute two async queries without waiting for the first to come back (or an async query then a non-async query), you get Commands out of sync; you can't run this command now.

However, this can be worked around via a connection pool.

You have your main, non-async database connection.

When you want to make an async query, you request an async database connection via the skeleton API above. This would use a connection pool - if there's no async connection, create a new one, if there's an existing async connection, check if it's free and use it if it is, if there are existing, busy async connections, create new ones up to some kind of limit, if the limit gets hit, wait until one returns then use it.

Examples of this logic are in https://github.com/amphp/mysql/tree/3.x however we are going to want it to work with Drupal's database API, so probably can't use it directly, however it'll be a good reference point regardless.

The drawback of the connection pool approach is there's overhead in creating a new database connection, but if we only do that for listing queries and similar, then the absolute number of connections should be restricted, and we can artificially cap it too.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

✨ Feature request
Status

Active

Version

11.0 πŸ”₯

Component
DatabaseΒ  β†’

Last updated about 2 hours ago

  • Maintained by
  • πŸ‡³πŸ‡±Netherlands @daffie
Created by

πŸ‡¬πŸ‡§United Kingdom catch

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @catch
  • πŸ‡¬πŸ‡§United Kingdom catch
  • πŸ‡¬πŸ‡§United Kingdom catch
  • πŸ‡¬πŸ‡§United Kingdom catch
  • πŸ‡¬πŸ‡§United Kingdom catch
  • πŸ‡³πŸ‡±Netherlands kingdutch

    I think with the learnings from πŸ“Œ [PP-2] Migrate BigPipe and Renderer fiber implementation to use Revolt Event Loop Postponed I'd propose a different approach:

    All database queries should always be async. This is possible with Fibers (and the Revolt Event Loop) because the code that requires the result of the database query will not be executed until the result is returned. This ensures that while any database query is happening, unrelated code can run (e.g. a cache prewarmer or another bigpipe task). In case a piece of code needs to make multiple database queries and wants to do them at the same time then a primitive such as stream or concurrently can be used to control how they're executed. That would lead to similar patterns as implemented in the BigPipe changes.

  • πŸ‡¬πŸ‡§United Kingdom catch

    All database queries should always be async. This is possible with Fibers (and the Revolt Event Loop) because the code that requires the result of the database query will not be executed until the result is returned.

    There are a couple of limitations which I think point against trying to do this:

    1. PDO doesn't have any async support, we have an issue to add a mysqli driver to core, specifically to enable this issue. So we need a non-async code path for those drivers based on a capability check.

    2.mysqli can only run one async query at a time per connection, it's not particularly well documented, but see for example https://stackoverflow.com/questions/12866366/php-mysqli-asynchronous-que... and https://dev.mysql.com/doc/refman/8.0/en/commands-out-of-sync.html

    This means that to execute an async query, we need to open an additional MySQL connection each time, or re-use a connection from which a query has previously returned.

    Especially with database caching, but also just in general, Drupal can easily execute dozens or hundreds of small database queries on each page request for authenticated users (or just on a cache miss). I think we would need/want to avoid a situation where we have say more than 5-10 connections open per request, since in a stampede we could end up hitting 'too many connections' limits. As soon as we have whatever limit of connections open (say 5) we'd be unable to execute any more until one returns, and we might not have anything else to do except issue more database queries. However, if we only execute async queries for longer database queries (entity queries and views etc.), then all the other ones can be handled by the main, non-async connection (whether that's using the mysqli driver or PDO) even if there are five, slow, async queries running.

    The other reason is that some database queries (again, especially with the db cache but not only) are directly blocking the rest of the request, something like a key value query, path alias, or route lookup. In those cases, we want those (usually sub 1ms) queries to finish as soon as possible so we can move onto the next blocking thing, and wouldn't want to go back into the event loop to do prewarming or whatever in between.

  • πŸ‡³πŸ‡±Netherlands kingdutch

    1. PDO doesn't have any async support, we have an issue to add a mysqli driver to core, specifically to enable this issue. So we need a non-async code path for those drivers based on a capability check.

    I feel like this is missing the beauty of Fiber's in our mental model :D The whole point of Fibers is that we can have things that are async which look exactly the same as sync code: it solves the "What color is your function" problem (see "What problem do fibers solve?" in Christian LΓΌck's great post.

    So where we should end up (and if we don't we did something wrong) is that code calling Entity::loadMultiple($ids) shouldn't care how we load the data (whether that's sync or async) because thanks to Fibers the code writing it can be sure that even if the loading itself happens asynchronously (and other things are done in the meantime) the code in this specific place won't run past Entity::loadMultiple($ids); until those entities have loaded.

    This means that to execute an async query, we need to open an additional MySQL connection each time, or re-use a connection from which a query has previously returned.

    Especially with database caching, but also just in general, Drupal can easily execute dozens or hundreds of small database queries on each page request for authenticated users (or just on a cache miss). I think we would need/want to avoid a situation where we have say more than 5-10 connections open per request, since in a stampede we could end up hitting 'too many connections' limits. As soon as we have whatever limit of connections open (say 5) we'd be unable to execute any more until one returns, and we might not have anything else to do except issue more database queries. However, if we only execute async queries for longer database queries (entity queries and views etc.), then all the other ones can be handled by the main, non-async connection (whether that's using the mysqli driver or PDO) even if there are five, slow, async queries running.

    The other reason is that some database queries (again, especially with the db cache but not only) are directly blocking the rest of the request, including in situations where most caches will be warm, something like a key value query, cache tag checksums, path alias, or route lookups. In those cases, we want those (usually sub 1ms) queries to finish as soon as possible so we can move onto the next blocking thing, and wouldn't want to go back into the event loop to do prewarming or whatever in between.

    I think in the form of a ConnectionPool this is a problem that's long since been solved in other programming languages and we shouldn't re-invent the wheel here. I also don't think we should differentiate between the type of query (because the workload is inherently unpredictable) and all queries should go through the connection pool; especially since there's no query that shouldn't happen.

    From a caller's perspective I don't even think we should know about the ConnectionPool, that should be the database driver's responsibility. Calling code just makes a database call and that'll suspend until data is available and will resume when data is present. Whether that suspension is because there's no connection free or whether the connection is waiting for data from a different connection shouldn't matter.

    For scheduling what code actually gets to load data the EventLoop helps us with priorities based on how things are added to it: EventLoop::queue happens immediately when the eventloop gets controlled; EventLoop::defer happens first in the next tick; EventLoop::delay and EventLoop::repeat happen only after deferred callbacks have run. Responses to streams are also treated as higher priority as far as I understand.

  • πŸ‡¬πŸ‡§United Kingdom catch

    I feel like this is missing the beauty of Fiber's in our mental model :D The whole point of Fibers is that we can have things that are async which look exactly the same as sync code:

    That's not the point though - if the query is synchronous, then we shouldn't be trying to suspend between executing it and it coming back, because there's no opportunity to do so.

    I think in the form of a ConnectionPool this is a problem that's long since been solved in other programming languages and we shouldn't re-invent the wheel here.

    Yes I mention a connction pool in the issue summary, but I don't think we should use that for cache queries (for example), there's literally no point to those being async and it's likely to make performance worse if we do that.

Production build 0.71.5 2024