- Issue created by @kingdutch
- 🇬🇧United Kingdom catch
We should add a dependency evaluation to the issue summary; it's almost there already, just release cycle and security policy I think: https://www.drupal.org/about/core/policies/core-dependency-policies/depe...
I still have not fully grasped the benefit of the central event loop vs., for example, adding a trait to cover the current raw Fibers implementation (which could then handle sleeping, suspending to any parent Fiber, etc. in one place centrally, but would still keep the management of each loop local). For me at least, it would be useful to see a conversion of the manual Fibers loops in core; we also have some examples of suspension in the cache prewarming/stampede issue (if not actual async anywhere yet).
- 🇳🇱Netherlands kingdutch
I've filled out the dependency evaluation.
I still have not fully grasped the benefit of the central event loop vs., for example, adding a trait to cover the current raw Fibers implementation (which could then handle sleeping, suspending to any parent Fiber, etc. in one place centrally, but would still keep the management of each loop local). For me at least, it would be useful to see a conversion of the manual Fibers loops in core; we also have some examples of suspension in the cache prewarming/stampede issue (if not actual async anywhere yet).
I emailed Aaron with this question and he replied with:
Drupal absolutely should use the Revolt event loop. The entire reason Revolt exists is to avoid fragmentation of the event loop component among PHP libraries which want to run asynchronous tasks. The event loop essentially becomes a part of the runtime: you cannot mix multiple event loops in the same application because only one can be running at a time. We talk a bit more about this at https://revolt.run/fundamentals.
Using a proprietary loop which schedules fibers would make Drupal incompatible with any library using a different fiber scheduler, i.e. any library using Revolt.
AMPHP might also be useful for some of the primitives it provides, such as Futures and Cancellations, as well as some of the lower-level helper libraries like amphp/pipeline. Note, though, that using AMPHP would be completely optional: Drupal could implement its own promises/futures, etc. and still be compatible with AMPHP so long as it was using Revolt to schedule events.
Revolt is flexible and un-opinionated, making it easy to create new fibers, use timers, and wait for I/O. Check out the docs at https://revolt.run and let me know if I can provide any additional examples or assistance.
The relevant part of that fundamentals document (in case it changes and someone is reading this in 2025 🙂):
Every application making use of cooperative multitasking can only have one scheduler. It doesn't make sense to have two event loops running at the same time, as they would just have to schedule each other in a busy waiting manner, wasting CPU cycles.
Revolt provides global access to the scheduler using methods on the Revolt\EventLoop class. On the first use of the class, it will automatically create the best available driver. Revolt\EventLoop::setDriver() can be used to set a custom driver.
To add to that, personally I think with the initial Fiber code that was added we've already seen some challenges with CPU spinlocking. To me this feels very much like a problem where the initial case is trivial, and as we adopt Fibers more we'll find more of these edge cases ("oh, we'd like to just let the system sleep until I/O comes back if there's nothing else to do"). We'd then be solving exactly the problems that Revolt has already solved, but doing so in a way that's not compatible with other async code in the ecosystem.
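To illustrate the "sleep until something is ready" point, here is a minimal sketch (assuming revolt/event-loop has been installed via Composer and PHP 8.1+): instead of a hand-rolled Fiber loop that busy-waits, Revolt's scheduler blocks the process until the next timer or I/O event is due.

```php
<?php

require 'vendor/autoload.php';

use Revolt\EventLoop;

// Schedule two one-off timers. Between callbacks the driver blocks on its
// backend (stream_select, ev, uv, ...) instead of spinning the CPU.
EventLoop::delay(0.1, fn () => print "first timer fired\n");
EventLoop::delay(0.2, fn () => print "second timer fired\n");

// Run the loop until no watchers remain, then return.
EventLoop::run();
```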
- 🇬🇧United Kingdom longwave UK
No current security policy published.
Can we ask the maintainers if they are willing to publish a security policy? Given that this is a low level runtime dependency it seems quite important that if there is a security issue the maintainers are prepared to fix it within a reasonable timescale.
- 🇳🇱Netherlands kingdutch
I've opened an issue with the request: https://github.com/revoltphp/event-loop/issues/87
- 🇬🇧United Kingdom catch
To add to that, personally I think with the initial Fiber code that was added we've already seen some challenges with CPU spinlocking. To me this feels very much like a problem where the initial case is trivial, and as we adopt Fibers more we'll find more of these edge cases ("oh, we'd like to just let the system sleep until I/O comes back if there's nothing else to do"). We'd then be solving exactly the problems that Revolt has already solved, but doing so in a way that's not compatible with other async code in the ecosystem.
I think we could solve some of that by moving the individual loops to use a helper class so there's less repetition. If that was the only reason I'm not sure it would be worth it, but the interoperability arguments here are quite strong, so that is pushing me over from neutral/on the fence towards pro-adoption of Revolt at the moment.
- 🇫🇷France andypost
Issue fixed: https://github.com/revoltphp/event-loop/issues/87
- 🇳🇱Netherlands kingdutch
Updated the remaining tasks. I've created a child issue to add the dependency to the composer.json: "Add revoltphp/event-loop dependency to core" (Active), which now also contains the dependency evaluation.
- 🇳🇱Netherlands kingdutch
Updated the issue summary with the remaining tasks to show tasks in progress. At least with the current proposed implementations it appears no work for PHPUnit is needed. If tests want to test something specifically that doesn't block the main thread at some point, then they'll have to run EventLoop::run() in the test themselves.
- 🇷🇺Russia Chi
It's not clear how Drupal will benefit from this.
This command can give a very rough estimate of how much request time we could potentially save. Compare the real and user times.

$ time php index.php > /dev/null
real 0m0.242s
user 0m0.152s
sys  0m0.055s
Yes, there are lots of IO operations, but not many of them can be taken out of the main code flow because subsequent steps typically depend on previous ones. I believe the earlier mentioned cases (Cache Prewarm and BigPipe) won't give a big performance gain. They need benchmarks to prove their usefulness.
- 🇷🇺Russia Chi
Yes, there are lots of IO operations
I guess those were mostly file operations performed by the Composer autoloader, because in CLI Opcache and APCu do not make much sense. In a real HTTP request served by FPM the difference between real and user will be much smaller.
- 🇬🇧United Kingdom catch
@chi are those numbers for core or for a real site that you're working on? If for a real site, was it with warm or cold caches? How many entities are there? What sorts of things are on the front page?
If it was for core, try loading a page immediately after a cache clear on a relatively complex site - e.g. with lots of content and a handful of views on the front page.
There are already some performance numbers on the cache prewarm issue.
- 🇷🇺Russia Chi
Re #13. It was a custom project with a few million entities. However, the front page is just a user login form without extra blocks.
Here are results with brand new D11 installation.
Cold cache
real 0m0.638s
user 0m0.346s
sys  0m0.103s

Warmed cache
$ time php index.php > /dev/null
real 0m0.123s
user 0m0.086s
sys  0m0.030s
Though the standard profile does render many things on the front page.
- 🇷🇺Russia Chi
@catch Those numbers are not for measuring site performance. The point was to check how many IO bottlenecks we potentially have. And again, checking it in the CLI SAPI is not correct.
As for benchmarking, testing just the front page may not be sufficient. I think Drupal needs a comprehensive set of load tests that can help figure out performance gains and track performance regressions.
I created a few K6 scenarios for testing the performance of a Drupal site. I wonder if Drupal core could implement something similar as part of its CI workflow.
https://github.com/Chi-teck/k6-umami
- 🇬🇧United Kingdom catch
As for benchmarking, testing just the front page may not be sufficient.
A page with just a login form is not going to benefit from this.
The sort of page that will benefit is a dashboard-y page like https://www.drupal.org/dashboard
A landing page with various views.
A content page with a 'related articles' block at the bottom etc.
If you have a page with 2-3 views and/or 2-3 entity queries that is where it gets interesting.
Let's say you have three slow views queries that take ~500ms each. If they are all executed async, then instead of 1500ms executed one by one, the linear time spent executing the queries could go down to ~500ms. Additionally, other CPU intensive tasks (like rendering results) could be happening while waiting for all three queries to come back.
Or if you have 5 entity queries that are 50 ms each, then instead of 250ms one by one, it could be 50ms (or 60ms more realistically given some time to execute each and collect the results) executing in parallel. And again other things can be happening while waiting for them to come back.
If you have just one related articles block that does a views query using the Similar By Terms module taking about 40ms, then that can still be executed async, and for example your footer menu, or social sharing buttons, or whatever other blocks could be rendered while waiting.
On pages like this, unless everything is a cache hit, Drupal is both i/o and CPU bound, but we can do both database query execution and CPU-intensive tasks in parallel once we have async database queries implemented.
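As a rough sketch of how that could look, here the ~500ms "queries" are simulated with Revolt timers, since core's async database driver doesn't exist yet (hypothetical code; fakeQuery and the block names are made up):

```php
<?php

require 'vendor/autoload.php';

use Revolt\EventLoop;

// Pretend this is an async views/entity query: suspend the current fiber
// and let the event loop resume it when the "result" arrives.
function fakeQuery(string $name, float $seconds): string {
    $suspension = EventLoop::getSuspension();
    EventLoop::delay($seconds, fn () => $suspension->resume("$name result"));
    return $suspension->suspend();
}

$start = microtime(true);

// Each "query" runs in its own fiber; the 0.5s waits overlap on the loop.
foreach (['view A', 'view B', 'view C'] as $name) {
    EventLoop::queue(function () use ($name) {
        printf("%s done\n", fakeQuery($name, 0.5));
    });
}

EventLoop::run();

// Wall time is roughly 0.5s for all three, not 1.5s.
printf("total: %.2fs\n", microtime(true) - $start);
```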
- 🇷🇺Russia Chi
Additionally, other CPU intensive tasks (like rendering results) could be happening while waiting for all three queries to come back.
That's unclear. How do we render results without having those results yet?
Drupal is both i/o and CPU bound, but we can do both database query execution and CPU-intensive tasks in parallel once we have async database queries implemented.
Having an async DB driver in place, we can run queries in parallel. But is it really possible for CPU tasks like rendering?
- 🇬🇧United Kingdom catch
That's unclear. How do we render results without having those result yet?
Let's say you have a landing page and it has 10 blocks on it - like a newspaper front page with hero image, breaking news, regional, business, sport all that kind of thing.
Let's say the listing query for five blocks takes 30ms, and for five blocks it's 60 milliseconds.
Rendering the results of each block (loading entities and entity view), takes 30 milliseconds each.
5 * 30ms +
5 * 60ms +
10 * 30ms
= 750ms
I'm deliberately using relatively short query times here to make the amount of i/o we have to play with fairly conservative.
We have the 10 blocks in a list somewhere, so each one fires off its listing query, and immediately after the query is sent says to the event queue "do something else while I'm waiting for the query to come back".
For each of the 10 blocks, this takes 1ms to go around and fire off each query.
Then we get back to the first block; it's been 10ms since we left it, and the query hasn't come back yet. Because we've got nothing to do, the event loop can either try to prewarm a cache somewhere, or do nothing for a while; it sleeps 0.5ms each iteration if there's nothing to do. Let's assume it does nothing useful for 20ms and just usleeps.
20ms later, the first query comes back, and we immediately load the relevant entities and render the block. This does not happen async as such, but it happens before we check if the other nine listing queries have come back.
This has now taken a total of 60ms wall time to render one block: 30ms to send and receive the initial query, and 30ms to render.
By now, because all of the other async queries we issued take either 30ms or 60ms to return, when we move onto the other nine blocks, all of the query results are sitting there waiting.
Each block takes 30ms to render; this does not and cannot happen async, we just immediately render each block.
So now the entire process has taken 30ms for the initial queries to fire and waiting time with nothing to do + 10 * 30ms to render each block sequentially = 330ms. More than twice as fast.
In a more realistic situation, there is likely to be a much more variable distribution of query times.
So let's say nine queries are 30-60ms but one query takes 250ms and this is the fifth query to run.
In this case, blocks 1-4 and 6-10 might return their results and render first, then finally that result comes back and we render the block.
This could still end up taking only 330ms for the entire process to complete, because 250ms + 30ms < 330ms and we can be rendering all the other blocks while waiting for the slow one to come back.
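For anyone wanting to check the arithmetic, the worked example above boils down to the following (illustrative figures from the comment, not benchmarks):

```php
<?php

// Back-of-envelope numbers from the worked example above (times in ms).
$fastQueries = 5 * 30;  // five blocks whose listing query takes 30ms
$slowQueries = 5 * 60;  // five blocks whose listing query takes 60ms
$rendering   = 10 * 30; // rendering each block takes 30ms

// Sequential: every query and every render runs one after another.
$sequential = $fastQueries + $slowQueries + $rendering;
echo "sequential: {$sequential}ms\n"; // 750ms

// Async: ~30ms to fire all queries and idle until the first result comes
// back, then rendering dominates because the other results arrive while
// we render.
$async = 30 + $rendering;
echo "async: {$async}ms\n"; // 330ms
```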
- 🇷🇺Russia Chi
Re: #18. We can start rendering blocks with fast queries while waiting for query results from blocks with slow queries? Is that correct?
- 🇷🇺Russia Chi
Anyway, for sites where SQL queries are not a bottleneck, the only solution would be having additional PHP processes.
Could think of a few options here:
1. Spawning new processes with proc_open (Symfony Process). For example, for landing pages like the ones you described, those could be very simple PHP scripts that do just one thing: render a single block.
2. Rendering blocks with PHP-FPM through hollodotme/fast-cgi-client. That should be faster than CLI workers as it is compatible with Opcache and APCu, and it supports async requests as well. Also, FPM has some options to control the number of workers in the pool.
3. Having a few workers connected to a queue. They should listen for jobs (block rendering) and reply instantly. That means the Drupal DB-based queue is not quite suitable, as it would require frequent polling.
4. The most extreme option: having a multi-threaded socket server written in PHP with Drupal bootstrapped. It should be able to handle multiple connections simultaneously, so when a Drupal site needs to render a dozen heavy blocks it just sends tasks to that server to render the blocks in parallel.
I suppose none of that can be fully implemented in Drupal core. However, it could provide some async API to allow delegation of block rendering to external systems.
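Option 1 could be sketched roughly like this (hypothetical code: render_block.php and the block names are made up, and it assumes symfony/process is installed via Composer):

```php
<?php

require 'vendor/autoload.php';

use Symfony\Component\Process\Process;

$processes = [];
foreach (['hero', 'news', 'sport'] as $block) {
    // Each worker script would bootstrap Drupal and render a single block.
    $process = new Process(['php', 'render_block.php', $block]);
    $process->start(); // non-blocking: the child runs in parallel
    $processes[$block] = $process;
}

// Collect the rendered HTML once every child process has finished.
$html = [];
foreach ($processes as $block => $process) {
    $process->wait();
    $html[$block] = $process->getOutput();
}
```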
- 🇬🇧United Kingdom catch
Re #19 yes exactly.
A persistent queue worker (Drupal as a daemon) is also a possibility, yes. To make that possible we'd need to resolve a lot of issues in core that make it hard to run, and yes, the actual implementation probably couldn't live in core, but we could make it easier. There are some issues around tracking this stuff.
- 🇷🇺Russia Chi
Re 18: There are a couple of things that can potentially break that calculation.
1. Database locks
When the cache is empty, the request will likely trigger lots of cache updates that may cause race conditions, especially when the same cache items need to be updated in different blocks.
2. Building vs rendering
Those terms are often misused. Building means creating a render array, while rendering is creating an HTML presentation of the content (string or Markup object). The problem here is that blocks typically rely on lazy builders, pre_render callbacks, theme functions, etc. That stuff delegates the rendering process to the theme layer.
Consider this render array. It costs nothing for the CPU to produce such content. The main work will happen later, when Drupal is rendering the content. And that means that async orchestration has to cover theming as well.

$build['content'] = [
  '#theme' => 'example',
];
- 🇬🇧United Kingdom catch
Building means creating a render array, while rendering is creating an HTML presentation of the content (string or Markup object). The problem here is that blocks typically rely on lazy builders, pre_render callbacks, theme functions, etc.
With BigPipe, each block placeholder is rendered to HTML independently (and then rendered as an inline AjaxResponse which then replaces the placeholder), so the bit that is controlled (currently by Fibers, eventually by Revolt) incorporates both building and rendering.
Building also generally includes loading entities - e.g. entity query, then load entities, then call view on the entities. So even if the actual rendering happens later, there are things to do in-between querying the entities and returning a render array for them.
When the cache is empty, the request will likely trigger lots of cache updates that may cause race conditions. Especially when the same cache items need to be updated in different blocks.
This would all happen in the non-async database connection, so I'm not sure why you think it would be different?
- 🇷🇺Russia Chi
I created a module to test the options described in #20. It builds a sort of landing page with blocks that can be slowed down.
https://github.com/Chi-teck/sample_catalog
Results are quite interesting though predictable.
When blocks are too slow, it doesn't matter which way you do parallel processing; the results are always quite good. I've managed to get a 12x boost using 12 CPU cores. However, when blocks are relatively fast, the cost of spawning new processes becomes significant. In that case the best results were achieved with a "co-pilot" server powered by RoadRunner. It demonstrated its usefulness even when building each block takes about 5-10ms.
Overall, landing pages are not the only use case for this. For instance, I have a project with a very heavy API endpoint for a collection of entities. Each item in that collection is personalized and is frequently updated, so caching is not possible. Building items in parallel using one of the above mentioned options can potentially improve API performance a great deal.
- 🇷🇺Russia Chi
This would all happen in the non-async database connection
I meant a single HTTP request without any concurrency. In that case, with a non-async database, all queries happen sequentially, so no locks are expected.
- 🇬🇧United Kingdom catch
However, when blocks are relatively fast, the cost of spawning new processes becomes significant.
Core will not spawn any new processes; it will be necessary to create a new database connection to run an async query (see [PP-1] Async database query + fiber support (Active)), but everything happens in a single process. This is what Fibers allows for compared to the previous approaches of reactphp and amphp. I think it will probably be possible to add async processing via additional processes in contrib, though, but I did not really think that far ahead yet.
- 🇫🇷France andypost
Ran into an OpenTelemetry warning today, which is caused by the trick required to propagate context into a newly created Fiber.
User warning: Access to not initialized OpenTelemetry context in fiber (id: 8909), automatic forking not supported, must attach initial fiber context manually in OpenTelemetry\Context\FiberBoundContextStorage::triggerNotInitializedFiberContextWarning() (line 74 of /var/www/html/vendor/open-telemetry/context/FiberBoundContextStorage.php).
So having a predictable API to auto-instrument core is a good point to keep in mind when adopting this: https://github.com/opentelemetry-php/context/blob/main/README.md#fiber-s...
- 🇷🇺Russia Chi
Still trying to comprehend how this event loop will work with the new MySQL driver ([PP-1] Create the database driver for MySQLi (Postponed)). As I understand it, revolt/event-loop is based on streams; stream_select is essentially the backbone of its async operations. That means the DB driver should be implemented through PHP streams, like amphp/mysql.
Did I miss something?
- 🇫🇷France andypost
There's only one way, mysqli::poll(), and streams are now everywhere in PHP.
- 🇳🇱Netherlands kingdutch
Catch already did a great job explaining in text. If anyone comes across this and is looking for an explanation that includes visuals, then I recommend watching the talk I gave at DrupalCon, which attempts to explain the scenarios in which the Revolt event loop will help us now and in the future: https://www.youtube.com/watch?v=tfppKrK1zGU
In the past week I've also been a guest on the Talking Drupal podcast where I did my best to answer similar questions that may be asked slightly differently and help it click: https://talkingdrupal.com/474
The question about the Async Database is a good question. The event loop can indeed use streams directly (as e.g. amphp/mysql does). However, with the primitives that the library provides it's also possible to do it in a looping manner. For example:
function drupalAsyncDbHandler(....) {
  // Start a query that requires polling.
  mysqli::startSomething(...);
  $suspension = EventLoop::getSuspension();

  // Check our database connection whenever nothing else is happening.
  $callbackId = EventLoop::repeat(0, function ($callbackId) use ($suspension) {
    // Ensure only one instance of this callback runs at a time.
    // Not needed if we're 100% sure that the rest of this function is synchronous.
    EventLoop::disable($callbackId);

    $ready = mysqli::poll(...);
    if ($ready > 0) {
      // Fetch a result.
      // Continue the code that's waiting for us with the query result.
      $suspension->resume($result);
      return;
    }

    // Otherwise, since we're not done, we try to poll again in the next callback.
    // Eat the error if the repeat was cancelled. This could happen if we cancel
    // the request and no longer need the result, for example.
    // Until: https://github.com/revoltphp/event-loop/issues/91.
    try {
      EventLoop::enable($callbackId);
    }
    catch (EventLoop\InvalidCallbackError $e) {
    }
  });

  // Wait for the result to have been fetched from the database.
  $result = $suspension->suspend();
  EventLoop::cancel($callbackId);
  return $result;
}
If you're dealing with Revolt primitives directly, then you'll have to think about the async states so that your calling code doesn't have to. For contrib there's the option of pulling in a lower-level library of their choice (e.g. ReactPHP, AMPHP, or something new) to do this for them.
For the above snippet I modified one of my examples from the Revolt playground, which attempts to demonstrate some of the scenarios you might currently find in Drupal (using Fibers) or other scenarios that have been discussed that we might need: https://github.com/Kingdutch/revolt-playground/