[PP-1] Create the database driver for MySQLi

Created on 21 January 2022, almost 3 years ago
Updated 11 June 2024, 5 months ago

Problem/Motivation

Running database queries as async queries lets us run them in parallel instead of sequentially. For pages with multiple "larger" queries this can speed up rendering such a page significantly. The problem is that the 3 by core supported database driver are all PDO drivers which do not support async queries. The PHP MySQLi extension does support async queries.
The other thing we need is PHP 8.1 with the Fiber functionality. Creating a database driver for MySQLi is the first step we need to make to start using async queries in Drupal.

@catch explained it as following:

Let's take something like https://www.drupal.org/dashboard โ†’ - there's four or five different views on there in blocks.

If you hit a completely cold cache, you have to execute all the views building those blocks, including both the listing query, then rendering the entities within them.

The idea being explored is that for each lazy builder - let's assume we're able to loop over them, you'd render it up until you have to run a views listing query. When it's time to run that query (or whatever appropriate point to delegate to the Fiber), it's done inside a Fiber with a non-blocking query. You then ::suspend() the Fiber and return to the main process, and start the next lazy builder, if that also has to run a views listing query, you'd also run it non blocking and ::suspend(), then you loop over all the lazy builders again to see if the various queries have come back, then render the results for anything that has, then continue polling until everything is rendered.

The page still can't finish rendering until the slowest query is finished, but if you have five slow queries that take 500ms each, then you might be able to run them all at once in the space of 600ms-1s instead of definitely having to run them sequentially in 2.5s. While you're running those simultaneous queries, you can also get on with PHP/CPU-intensive tasks like rendering - assuming we're able to arrange things so that these can be interleaved.

Fibers only allow us to paralellize i/o, not CPU, because there's not any parallel execution at all, so it has to be something like an async MySQL or SOLR query or http request to make any difference, and there has to be something to be getting on with while it's running - either more i/o or something else.

Ideas template

What is the problem to solve?

It is very easy to configure Drupal in such a way that it generates slow database queries, views allows complex joins, sorts, aggregates. Sometimes these queries will be unoptimized, but still run quite quickly on a small dataset or when the database isn't under load, then as sites grow (in size and/or traffic) they start to show problems. We also ship the SQL-based search module in core. Once you add in blocks and layouts, there can be several slow listing queries on a single page.

Equally, Drupal relies a lot of discovery (theme registry, plugins etc.) which tends to be CPU intensive on cache misses and take a long time. And our rendering pipeline, even on only a render cache miss is quite expensive due to the granularity at which we render HTML.

PHP 8.1 added support for Fibers, which allows applications to execute non-blocking operations (like an async MySQL query), and then do something else (execute something else non-blocking, or something CPU intensive) in between firing the query and it being returned. This was previously possible with applications like amphp and reactphp but not in a way that was suitable for Drupal - your application would have to be built around those libraries.

Fibers allows the following:

1. I have a set of things (in Drupal's case, render placeholders are the first and most useful example) that I need to process all together. If one of them executes something asynchronous and suspends a fiber, I can move onto one of the other ones in the meantime. This is ๐Ÿ“Œ Add PHP Fibers support to BigPipe RTBC and ๐Ÿ“Œ Add Fibers support to Drupal\Core\Render\Renderer RTBC .

2. I've just fired off something asynchronous, if I've been called within a fiber, I can suspend it, and then when I'm resumed the query/http request etc. might already have been returned. This is what will be added by this issue.

The very, very important thing is that the code in #1 and the code in #2 don't need to explicitly know about each other or have any interdependencies, this way the application can become Fibers-aware cumulatively and any instance of #1 should work with any instance of #2 and vice-versa.

MySQL supports async queries, however PDO does not. We'd therefore add a driver based on MySQLi which does.

Who is this for?

Site builders and developers who want faster websites.

Result: what will be the outcome?

There will be a choice between PDO and MySQLi drivers when running Drupal with a MySQL database. Because we don't currently require mysqli, we can't just switch the current PDO driver over.Very far in the future we might consider deprecating the PDO driver or and moving it to contrib. the MYSQLi driver will probably be experimental to start with.

If the MYSQLi driver is installed, views (and probably search and entity queries) wil execute its query async if called within a fiber and suspend the fiber - this can be automatic without a configuration option. โœจ [PP-1] Add async query execution support to views Active

How can we know the desired result is achieved?

Once all the pieces are in place, it should be fairly straightforward to show the performance improvement.

How do we expect this to impact Drupal users (site administrators, developers, or site builders)?

They will get an extra option when installing Drupal, depending on whether they have PDO or mysqli installed or both. Some site admins may want to switch their database driver over (possible by enabling the module then changing the databases array).

What kind of challenges do we expect this to cause with users? Does this break any existing installations / use-cases?

An extra choice on installation if your hosting supports mysqli and pdo, otherwise none. It should not break anything (except for regular bugs) and be transparent otherwise.

Does this add new dependencies to Drupal? If yes, is it available broadly enough or are there other concerns with the dependency?

No hard dependencies, but the MYSQLi driver will depend on the php_mysqli extension.

Is this something users would have to install or is it installed automatically for them?

They'll have to select it unless we eventually deprecate the PDO driver.

Are there reasons to not do this in the pre-existing Mysql module?

We need to continue to support PDO since not all hosting has php_mysqli enabled, so it has to be a new driver. However it might be fine to have both drivers in the same module.

Proposed resolution

Add a new MySQLi database driver.

This would initially be in its own module so it can be experimental. We could probably merge the driver into the main mysql module when it becomes stable.

Todo

  • Add tests for #90.2, autoloading of database driver module dependencies an override to when the driver is "mysqli" to also autoload the "mysql" module.

Done

  1. test for #90.1
  2. ๐Ÿ“Œ Refactor transactions Fixed
  3. ๐Ÿ› Implement statement classes using \Iterator to fix memory usage regression and prevent rewinding Fixed
  4. โœจ Introduce database driver extensions and autoload database drivers' dependencies Fixed
  5. ๐Ÿ“Œ Introduce a FetchModeTrait to allow emulating PDO fetch modes Fixed
  6. ๐Ÿ“Œ Deprecate usage of Connection::getDriverClass for some classes, and use standard autoloading instead Fixed

Remaining tasks

TBD

User interface changes

TBD

API changes

None

Data model changes

None

Release notes snippet

TBD

๐Ÿ“Œ Task
Status

Postponed

Version

11.0 ๐Ÿ”ฅ

Component
Databaseย  โ†’

Last updated 2 days ago

  • Maintained by
  • ๐Ÿ‡ณ๐Ÿ‡ฑNetherlands @daffie
Created by

๐Ÿ‡ณ๐Ÿ‡ฑNetherlands daffie

Live updates comments and jobs are added and updated live.
  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

Production build 0.71.5 2024