Add an 'instant' queue runner

Created on 15 June 2011
Updated 5 June 2024

Problem/Motivation

Drupal has two ways to handle long-running tasks:

1. Queue API: by default, queues are run on cron; sites often run queues via Jenkins on different schedules, and there are projects like 'waiting queue' which aim to minimise the time before queue items are picked up.

2. Batch API: designed for the browser, although Drush can also run batches. It completely blocks the user's interaction with the site while it's running. Batches are user-specific, so if one gets interrupted, that's pretty much it. It provides feedback, so you know when it's finished.

From personal experience, the Queue API is great to work with and expectations are very clear. Batch API is hard to work with and debug.

In the UI, Batch API's always-blocking behaviour isn't necessarily optimal. Bulk operations on entities, for example, could be put into a queue, with a progress bar in a block somewhere which runs down the queue but allows the user to click around the rest of the site while that's happening. There are cases, though, like updates from the UI, where things really do need to block and require strict ordering.

Proposed resolution

In the automated_cron module, introduce the possibility to process queues instantly on the terminate event. The following two configuration options will be supported:
- Maximum number of items: the maximum number of items per queue that will be processed instantly. If more items are added to the queue, they will be processed on the next run of the queue or cron. Only the queues to which items were added in that session will be processed. If this value is set to zero, instant queue processing is disabled.
- Maximum concurrent processing: if multiple submissions create queue items, the maximum number of concurrent queue processing runs can be configured. This ensures that the site is not overloaded with queue processing. It defaults to 1.
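Concretely, the two options might map to settings along these lines - a hypothetical sketch only, since the actual automated_cron schema and key names would be decided in the merge request:

```yaml
# automated_cron.settings - hypothetical keys, illustration only.
instant_queue:
  # Maximum items processed per queue in the terminate event.
  # 0 disables instant queue processing entirely.
  max_items: 5
  # Maximum number of requests allowed to process queues concurrently,
  # to avoid overloading the site. Defaults to 1.
  max_concurrent: 1
```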

Remaining tasks

User interface changes

API changes

Data model changes

Cron jobs fall into a couple of main categories:

- things that have to be run periodically and don't care about site events - mainly garbage collection like session and cache.
- batch processing of things - search indexing of new nodes, purging of deleted stuff.

For the latter case, these are increasingly moving to the queue API, although it's not 100% consistent in core.

Issues like #943772: field_delete_field() and others fail for inactive fields, and the one I can't find about indexing nodes for search immediately, might be helped by a poor man's queue runner.

Drupal 7 has a poor man's cron. Currently the implementation is very basic: 1px gif/XHR requests were causing Drupal to be bootstrapped twice per request, and at one point there was a proposal to do ob_flush() during a shutdown function, but this didn't catch on, so we ended up just running cron inline instead, which is sucky, but I argued for that in the end.

With the queue runner, it'd be more a case of setting $_SESSION['run_queues'] after a form submit, checking that on the next page and, if it's set, adding the 1px gif or whatever to that page, which hits /queue_run with a token. This would only ever be triggered by form submissions, so it'd not have the page caching issues of cron runs.
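A rough sketch of that flow, with invented function names - the /queue_run path and the token scheme here are illustrative only, not an actual implementation:

```php
<?php
// Hypothetical sketch of the session-flag + 1px gif idea described above.
// Function names and the /queue_run route are invented for illustration.

// On form submit: flag the session so the *next* page knows to embed
// the queue-runner gif.
function queue_runner_form_submit(): void {
    $_SESSION['run_queues'] = TRUE;
}

// On the next page build: if the flag is set, clear it and embed a
// 1px gif pointing at a token-protected queue-runner path.
function queue_runner_page_build(array &$page): void {
    if (!empty($_SESSION['run_queues'])) {
        unset($_SESSION['run_queues']);
        // The token would be validated by the /queue_run callback so the
        // path can't be hit arbitrarily (e.g. by crawlers or page caches).
        $token = hash_hmac('sha256', session_id(), 'site-private-key');
        $page['queue_gif'] =
            '<img src="/queue_run?token=' . $token . '" width="1" height="1" alt="" />';
    }
}
```

Because the flag only ever gets set on form submissions, anonymous page-cache hits never trigger the gif, which is the property the comment above is after.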

Things it could be useful for:

- field deletion
- mass deletes of other stuff
- operations that trigger menu rebuilds or similar expensive operations, that don't necessarily have to happen inline with the request - just very shortly afterwards.
- indexing nodes in the core search module immediately after posting instead of waiting for cron.

Feature request
Status

Needs review

Version

11.0

Component
Base 


Created by

🇬🇧United Kingdom catch

Tags

  • Performance

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇬🇧United Kingdom catch

    @andypost amp or reactphp might help on the cli, but this requires running a daemon waiting for queue items to come in, which isn't going to be an option on a lot of hosting environments. That's more what https://www.drupal.org/project/advancedqueue and similar are trying to do.

    What I am thinking about here is a module we can ship with core, similar to automated_cron, that will run some queues down in the browser. It is not as good as a waiting queue, but it would allow us to do things like add functionality like https://www.drupal.org/project/image_style_warmer to core.

    When we originally added automated cron to core, one of the ideas was to add a 1px gif following POST requests. This was rejected at the time because it was not guaranteed to run often enough on a site without much authenticated activity, but I think it would be enough here since we're explicitly hoping to execute queue items created by things like saving an entity form.

  • 🇬🇧United Kingdom catch

    After 🐛 Post-response task running (destructable services) are actually blocking; add test coverage and warn for common misconfiguration Fixed, there's a possible path forward here. I think we could add it to automated_cron:

    1. Automated cron adds a decorator for the queue service, this keeps a record of any queue that has an item added during a request.

    2. That same decorator implements DestructableInterface and runs some (configurable via container parameter?) number of queue items from the list of queues it has at the end of the request - so it could do 1 item each from five queues, or 5 items from one queue, or similar.

    We can't guarantee that the queue items processed will be the ones that were created during the request, but some queue items will be processed, and anything that isn't gets picked up by the next cron run.
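    The two steps above can be sketched in plain PHP. This is a minimal, self-contained illustration of the decorator idea - QueueLike, SimpleQueue and TrackingQueueDecorator are invented stand-ins for Drupal's real QueueInterface, queue backends and DestructableInterface, not the actual core API:

```php
<?php
// Sketch of a queue decorator that records which queues received items
// during a request, then drains a capped number of items at the end.

interface QueueLike {
    public function createItem($data): void;
    public function claimItem();
}

// Simplified in-memory stand-in for a real queue backend.
class SimpleQueue implements QueueLike {
    private array $items = [];
    public function createItem($data): void { $this->items[] = $data; }
    public function claimItem() { return array_shift($this->items); }
}

class TrackingQueueDecorator implements QueueLike {
    /** @var array<string,int> Queues that received items this request. */
    private static array $touched = [];

    public function __construct(private string $name, private QueueLike $inner) {}

    public function createItem($data): void {
        // Step 1: record that this queue received an item during the request.
        self::$touched[$this->name] = (self::$touched[$this->name] ?? 0) + 1;
        $this->inner->createItem($data);
    }

    public function claimItem() { return $this->inner->claimItem(); }

    /**
     * Step 2 (the destruct phase): run at most $maxItems from each
     * touched queue, then reset the tracking list.
     */
    public static function destruct(array $queues, int $maxItems, callable $worker): int {
        $processed = 0;
        foreach (array_keys(self::$touched) as $name) {
            for ($i = 0; $i < $maxItems; $i++) {
                $item = $queues[$name]->claimItem();
                if ($item === null) {
                    break;
                }
                $worker($name, $item);
                $processed++;
            }
        }
        self::$touched = [];
        return $processed;
    }
}
```

    The key point is that the decorator only records queue names and counts during the request; the destruct step then drains at most a fixed number of items per recorded queue, and anything left over waits for the next cron run.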

    This would allow us to fix long-standing issues like #504012: Index content when created. That issue could continue to mark the node for reindex, but also add a queue item that will reindex it when run; automated cron then picks that queue item up and reindexes the node at the end of the request.

    It would also open up the possibility of bringing the functionality of https://www.drupal.org/project/image_style_warmer and similar modules into core - we just create the queue item, and if you have automated cron it might handle it, but it would also work if you have a dedicated permanent queue runner or frequent drush jobs just for running queues.

  • 🇮🇳India sukr_s

    I'm proposing the following implementation in the automated_cron module:
    - Instant queue processor will have the following configuration:
      - Min number of items to trigger the queue. Setting it to zero will disable instant queue processing.
      - Max number of concurrent queue processing runs.
    - Implement a queue.database manager extending the core queue.database which will track the items being added to the queue. All queueing functions will be delegated to the core implementation.
    - Add a new route with no access check that will process the queue.
    - At the end of a request, if the configured number of queue items is reached, async queue processing will be triggered using the mechanism in https://github.com/biznickman/PHP-Async. I've been using this approach and it works quite well.

    This will not use any browser based implementation or 1x1 gifs.

    Any thoughts, suggestions, edits or objections? I'll try to implement the same shortly.

  • 🇬🇧United Kingdom catch

    @sukr_s when we originally implemented automated cron in core, we looked at making asynchronous http requests to a Drupal route, but eventually gave up on it because some hosting environments do not cleanly support making http requests to themselves. This is why automated_cron uses the terminate event, to run after the response has been sent to the browser but before script termination. The post response logic works a lot more consistently after 🐛 Post-response task running (destructable services) are actually blocking; add test coverage and warn for common misconfiguration Fixed .

    So I would just directly process the queue items in the terminate event listener without trying to deal with async http requests. This also avoids having to worry about an access-free route for processing the queue items, which we'd probably have to lock down with a token or similar.

    - Min number of items to trigger the queue. Setting to zero will disable instant queue processing
    - Max number of concurrent queue processing

    Why not a maximum number of queue items to process, and if there's one or more queue item added during the request, process them at the end? If the maximum number is 0, nothing ever gets processed. I don't see what a minimum number of items gets us.

  • 🇮🇳India sukr_s

    @catch

    So I would just directly process the queue items in the terminate event listener without trying to deal with async http requests

    With this suggestion, I'm assuming that processing in the event listener will be non-blocking, otherwise async call is better. Will go with your suggestion for now.

    Why not a maximum number of queue items to process,

    Do you mean not to process more than the maximum number of items in the queue at a time? I was thinking of setting the time limit to zero so that all the items in the queue would be processed. Otherwise, in each terminate event we would have to check the remaining number of items in the queue and trigger again, which would mean additional db calls in the terminate event.

    and if there's one or more queue item added during the request, process them at the end?

    Yes that's the current thought as well that the process queue call would be done in the terminate event to avoid multiple calls in cases where multiple items are added to the queue.

    I don't see what a minimum number of items gets us.

    If there was a need to process in a batch instead of immediate processing - but perhaps that's an unwanted frill.

  • 🇬🇧United Kingdom catch

    With this suggestion, I'm assuming that processing in the event listener will be non-blocking, otherwise async call is better. Will go with your suggestion for now.

    Yes it's non-blocking - the terminate event runs after the response is sent to the browser. Prior to Drupal 10.2-ish this only worked on certain server configurations but since the issue linked above it works on nearly all.

    Do you mean not to process more than the maximum number of items in the queue at a time. I was thinking of setting the time limit to zero so that all the items in the queue will be processed. Otherwise in each terminate event we will have to check the remaining number of items in the queue and trigger again, which would mean additional db calls in the terminate event.

    The queue could potentially have thousands of items in it, which could lead to OOM errors etc. if we tried to process everything. The way I thought of this working was that we would track which queues had items added, and how many, during a request - probably in a class property of the decorator or similar.

    Say someone submits a node form, and that adds one queue item to two different queues, A and B, then in the terminate event, we'd process one item from queue A and one item from queue B. Obviously there's no guarantee that these are the same queue items at all, but this feature is mostly intended for lower traffic sites where that's more likely to be the case.

    However, if submitting the node form added 500 items to the queue (e.g. it adds a queue item for every person following an issue or something), we'd only process the configured maximum of items and then stop (anything else would eventually be picked up by cron, which also processes queue items).
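    The capping behaviour described in these two paragraphs boils down to a small pure function - itemsToProcess is a hypothetical name, assuming we track per-queue counts during the request:

```php
<?php
// Sketch of the per-queue cap described above: given the number of items
// each queue received during the request and a configured maximum, decide
// how many items to run from each queue in the terminate event.
// itemsToProcess is an invented helper name, not a core API.
function itemsToProcess(array $addedPerQueue, int $max): array {
    $plan = [];
    foreach ($addedPerQueue as $queue => $added) {
        // Never process more than the configured maximum per queue;
        // anything left over is picked up by the next cron run.
        $plan[$queue] = min($added, $max);
    }
    return $plan;
}
```

    So a node form that adds one item each to queues A and B yields one item processed from each, while a form that adds 500 items to one queue only gets the cap's worth processed now.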

  • Merge request !8280: instant queue support (Open), created by sukr_s
  • Pipeline #190631 finished with Failed, 24 days ago (total: 176s)
  • Pipeline #190648 finished with Failed, 24 days ago (total: 510s)
  • Pipeline #190683 finished with Failed, 24 days ago (total: 579s)
  • Pipeline #190700 finished with Failed, 24 days ago (total: 528s)
  • Pipeline #191411 finished with Failed, 24 days ago (total: 557s)
  • Pipeline #191435 finished with Failed, 24 days ago (total: 559s)
  • Pipeline #191449 finished with Failed, 24 days ago (total: 490s)
  • Pipeline #191480 finished with Success, 24 days ago (total: 570s)
  • Status changed to Needs review 24 days ago