- Issue created by @kingdutch
- π³π±Netherlands kingdutch
I've created a branch to start work onto which I've cherry-picked the proposed changes for the issues that are blocking this. Those commits will need to be removed in a rebase when other issues land.
The proposal for the BigPipe rewrite is in https://git.drupalcode.org/issue/drupal-3425212/-/commit/cd1c3fb87507dc5... which only looks at the BigPipe implementation and does not yet do anything for the tests themselves which contain some Fiber specific code that will also need adapting.
- Merge request !6977Draft: Issue #3425212: Migrate BigPipe implementation to use Revolt Event Loop β (Open) created by kingdutch
- Status changed to Postponed
8 months ago 10:16pm 8 March 2024 - π³π±Netherlands kingdutch
Merging this is postponed on the two other issues landing but the proposed MR includes the code of the other two issues to show that altogether all the test coverage that exists is green and good to go.
As stated in the Revolt documentation "every application making use of cooperative multitasking needs a single scheduler (also called event loop), which this package provides." which is also backed up by Node.js's event loop implementation or the different native PHP event loop implementations that Revolt delegates to for us.
While working on converting the BigPipe mini-event loop into a set of tasks on Revolt's event loop this necessity was discovered in that the mini-event loop in the Renderer caused tasks to be resumed in placeholder tasks on the renderer. To fix this we'll have to convert both at the same time and I've adjusted this issue accordingly.
There are still Fiber references in the Drupal code-base because they are for suspending on single-tasks and thus don't act as mini-event loops. This causes them to work with the Revolt event loop code. We probably want to convert those to Revolt's event loops primitives in the future, but I've left everything that wasn't required for the test coverage to function outside of the scope of this issue to reduce the review surface.
- π¨π¦Canada Charlie ChX Negyesi πCanada
First of all: feel free to dismiss my concern. My view on things are warped because I spend little time building from the UI, not that much writing new code -- most of my time is spent with debugging. So when I see this:
foreach ($operations as $key => $operation) { EventLoop::defer(function () use ($suspension, $key, $operation) { try { $result = $operation(); $suspension->resume([$key, Result::ok($result)]); } catch (\Throwable $e) { $suspension->resume([$key, Result::error($e)]); } }); }
this makes me afraid it'll make code simply undebuggable. I have been burnt by the GraphQL module before using similar patterns. You put a breakpoint somewhere, get a stacktrace and it's completely useless because it comes from this thing which merely executes something assembled by something else and the two thing are so far removed you can't stitch it together again. I very well recognize this is a pattern used in other programming languages but that is a theoretical reassurance that doesn't help debugging Drupal code.
- π³π±Netherlands kingdutch
Thanks for sharing your concern!
Without a specific example of the problematic GraphQL stack-trace it's hard to comment on the specific issue (perhaps sharing this in the GraphQL issue queue if you haven't done so already can help us find improvements there). For now I'm going to assume that the trouble you experience with GraphQL is specifically because we build the schema in one place and then execute against the schema later in another place. While doing that we use a user-land promise style which might complicate this a lot.
There's also the fact that Drupal's exception handler delegates to
Error::decodeException
which only handles the top level exception and truncates any nested exceptionsFor example the following code in a PHP sandbox
try { throw new \RuntimeException("Foo"); } catch (\Exception $e) { throw new \Exception("Exception thrown", 0, $e); }
Properly shows both stack-traces
Fatal error: Uncaught RuntimeException: Foo in /in/Q5PRm:4 Stack trace: #0 {main} Next Exception: Exception thrown in /in/Q5PRm:7 Stack trace: #0 {main} thrown in /in/Q5PRm on line 7 Process exited with code 255.
That same implementation in Drupal would only log the
\Exception
which is thrown in thecatch
block and doesn't actually contain the useful information. Ensuring that Drupal prints the entire chain of exceptions would be a big developer experience win and might solve (some of) your debugging issues.With those two things in mind there's two changes for the implementation that we're discussing here.
Firstly I made sure not to wrap any caught exception (or exception contained in the Error result) so that Drupal doesn't hide the origin of the exception. Instead I throw them as-is.
Secondly the implementation we use here is built on Fibers which doesn't suffer the problem that GraphQL's promise-based implementation might have because PHP itself keeps track of where the work was postponed and can thus also restore the callstack. I created a small reproduction of what an exception would look like for the bigpipe code in my Revolt Playground. The resulting stacktrace is copied below and I've prepended numbers so I can annotate them below.
[1] at revolt-playground/src/Demo/Stacktrace/Command.php:43 [2] Kingdutch\RevoltPlayground\Demo\Stacktrace\Command->throwException() at revolt-playground/src/Demo/Stacktrace/Command.php:25 [3] Kingdutch\RevoltPlayground\Demo\Stacktrace\Command->Kingdutch\RevoltPlayground\Demo\Stacktrace\{closure}() at revolt-playground/Drupal/Core/Async/functions.php:41 [4] Drupal\Core\Async\{closure}() at revolt-playground/vendor/revolt/event-loop/src/EventLoop/Internal/AbstractDriver.php:597 [5] Revolt\EventLoop\Internal\AbstractDriver->Revolt\EventLoop\Internal\{closure}() at n/a:n/a [6] Fiber->resume() at revolt-playground/vendor/revolt/event-loop/src/EventLoop/Internal/AbstractDriver.php:497 [7] Revolt\EventLoop\Internal\AbstractDriver->invokeCallbacks() at revolt-playground/vendor/revolt/event-loop/src/EventLoop/Internal/AbstractDriver.php:553 [8] Revolt\EventLoop\Internal\AbstractDriver->Revolt\EventLoop\Internal\{closure}() at n/a:n/a [9] Fiber->resume() at revolt-playground/vendor/revolt/event-loop/src/EventLoop/Internal/AbstractDriver.php:94 [10] Revolt\EventLoop\Internal\AbstractDriver->Revolt\EventLoop\Internal\{closure}() at revolt-playground/vendor/revolt/event-loop/src/EventLoop/Internal/DriverSuspension.php:117 [11] Revolt\EventLoop\Internal\DriverSuspension->suspend() at revolt-playground/Drupal/Core/Async/functions.php:55 [12] Drupal\Core\Async\stream() at revolt-playground/src/Demo/Stacktrace/Command.php:28 [13] Kingdutch\RevoltPlayground\Demo\Stacktrace\Command->execute() at revolt-playground/vendor/symfony/console/Command/Command.php:279 [14] Symfony\Component\Console\Command\Command->run() at revolt-playground/vendor/symfony/console/Application.php:1031 [15] Symfony\Component\Console\Application->doRunCommand() at revolt-playground/vendor/symfony/console/Application.php:318 [16] Symfony\Component\Console\Application->doRun() at revolt-playground/vendor/symfony/console/Application.php:169 [17] Symfony\Component\Console\Application->run() at revolt-playground/index.php:21
1 and 2 is where the actual exception occurs.
3 is the indirection of wrapping our function call in a closure to be able to prepare it as a deferrable operation.
4-11 is Revolt internals which take care of the scheduling of the asynchronous operations.
12 is where we loop over the operations (thus invoking them) usingstream
13-17 is Symfony's application and console command handling. - π¬π§United Kingdom catch
Did a first pass review of the MR here. The BigPipe tests caught loads of bugs when I was working on the Fibers implementation, so them passing here makes me very confident this is all working as expected too. It definitely reduces a lot of duplication of the Fiber looping.
The Revolt API looks 'new' and it's a bit bigger than the raw PHP Fibers API, however having worked on the initial Fibers implementation, first me until I actually tried to use it, then loads of other people I talked to about it, got very confused about what Fiber::suspend() and Fiber::resume() etc. actually do, so having a more commonly used API with its own high level docs around some of those things will probably help.
Additionally, hardly anyone is going to be writing async code for Drupal. The actual async code we're likely to add in core will be in β¨ [PP-1] Async database query + fiber support Active and once that's added, Views should use it β¨ [PP-1] Add async query execution support to views Active , and that will mean it runs a lot but without anyone needing to write async queries themselves let alone a separate async implementation. There are other use cases (Guzzle http requests, sending e-mails etc.) but not that many because it needs to be an i/o heavy operation to be worthwhile.
- π³π±Netherlands kingdutch
I've rebased the code on top of 11.x and re-applied the changes from π Add revoltphp/event-loop dependency to core Active and β¨ Implement a Result type in Drupal Core Needs review so that their code is up-to-date according to the reviews from those issues.
By larowlan
I feel strongly either way about the use of namespaced functions. I would like to see some more documentation here of examples of what operations might be.
e.g. some async, some delayed, some deferred
I'm not entirely sure what examples to provide. Delayed and Deferred functions are examples of async functions. The semantics of what is "delayed" vs "deferred" are those matching with the Revolt Event Loop.
There is no hard requirement that an operation is actually asynchronous either. In case the operations themselves are actually synchronous then the form of
stream
(by scheduling withEventLoop::defer
) ensures that other operations can still run in between synchronous operations as needed. In case the operations themselves suspend then thanks to the event loop this automatically provides places for other work to occur. The important part forstream
is that we want to process tasks that are done as soon as possible, so that they may be out of order if a task started first takes longer.I created an example of this in kingdutch/revolt-playground github repository. One way to show what would happen for synchronous operations would be to extend that example with
placeHolderSync
that usessleep
for a certain period of time, rather than using theEventLoop::delay
function.I hope that example can help clarify of what kind of additional documentation you'd like to see :D
- π¨π¦Canada Charlie ChX Negyesi πCanada
EventLoop::defer(function () use ($suspension, $key, $operation) { try { $result = $operation(); $suspension->resume([$key, Result::ok($result)]); } catch (\Throwable $e) { $suspension->resume([$key, Result::error($e)]); } });
I really dislike this pattern. I complained in Slack this doesn't feel like PHP to me and later when digging around for the event dispatcher issue have I found https://www.drupal.org/project/drupal/issues/1972300#comment-9216241 β which expresses the exact same concern about the same pattern.
But whether it's typical to PHP or not, it's certainly alien to Drupal.
And we have a lot of examples of what happens when brilliant people (and Kingdutch is one without a doubt) introduce patterns familiar to them.
We see it in Views which was literally written by a wizard (hi Merlin :) ) and it is still a PHP4 codebase mostly because no one has the slightest idea any more what's going in there. If you disagree, please document what items are in MultiItemsFieldHandlerInterface , thanks :)
We see it in entity query where Tables.php is an eldritch abomination written by me (sigh) and the moment I stopped working on π Entity field relationship queries for multi column field items stored in shared tables are broken Needs work the issue died.
We see it in Typed Data where the number of bugfixes per year is below one (not counting the mechanical upkeep with latest upstream/Drupal). And that's not because it's bug free.
The moment Kingdutch stops working on this, this component will die too.
This is why I object.
- π¬π§United Kingdom catch
Worse, it builds on fibers which are, by themselves, a very new concept and not at all familiar to anyone.
Fibers was added to PHP 8.1, which was released in 2021, we're now three years past that.
PDO was added to PHP in 2004, the dbtng issue was committed in 2008, four years later.
The diffstat here is 400 lines, the dbtng patch was nearly 400kb. Every contrib module had to rewrite their database queries due to dbtng, exactly zero contrib modules would be required to change anything because of the change here.
It's true that there are some large parts of core that are either not well maintained, or are mostly only receiving bugfixes - but you could have said that about BigPipe until the Fibers patch landed too. Part of maintaining things is updating them to take advantage of new language features etc.
And the worst thing is, it won't be core that will go sideways, no it'll be when you have a large codebase five years down the line having thirty calls to stream() -- what then?
We're already using Fibers in core, but in literally two places, which are mutually exclusive on runtime - the renderer, and the BigPipe renderer that replaces it when you enable big_pipe module. There is one other place that we might add a fiber loop/stream call, which is in π Add a cache prewarm API and use it to distribute cache rebuids after cache clears / during stampedes Needs work .
I do not think any codebase will find thirty places to call
stream()
in fact even for complex code-bases, I would expect the number of places they would call it themselves would be between zero and 1.That's because by using the existing render placeholder system (and the cache prewarm fallback) as the entry point, there are few other opportunities to do so (and that's what makes the bigpipe implementation so encouraging from a cold cache performance, once we have async database query support to actually exploit it). What we will eventually do is find more places to call Fiber::suspend() or the revolt equivalent but that's also not going to be many.
The exception would be very low-level, cli-only processes - for example something like aggregator module refreshing feeds on cron, or maybe update status's XML parsing - but those are inherently self-contained operations that aren't happening during normal page rendering, so the arbitrary callable issue doesn't really come up there in the same way.