BatchQueue::claimItem() can get stuck with huge data in batch queue

Created on 2 April 2020
Updated 23 January 2025

TLDR

BatchQueue::claimItem() runs a query that can cause memory problems in the database engine if the queue table contains huge chunks of data for a single batch.

This can lead to the query getting stuck.

The visible effect is that the batch operation gets stuck / does not start.

Background / use case

On a project I am working on, I encountered a case where a query in BatchQueue::claimItem() got stuck because another module (search_api_et) added unreasonably large chunks of data to the 'queue' table. See #3124298: Huge data size in batch queue items.

The first lesson of course is to fix that contrib module.

But I also found a possible (memory) optimization for the query in core, so that it will scale much better to huge batches. See #985184-6: Poor performance of batch API with large queues ff.

Technical details

The query looks like this:

class BatchQueue extends SystemQueue {
  [..]
  public function claimItem($lease_time = 0) {
    $item = db_query_range('SELECT data, item_id FROM {queue} q WHERE name = :name ORDER BY item_id ASC', 0, 1, array(':name' => $this->name))->fetchObject();

In our case, the 'queue' table contained around 500 entries for a single batch name, and the 'data' column was quite big for each one.

Internally, it seems MySQL does the following (see the check sketched after this list):

  • Build the full list of 500 results with data + item_id. This can get very big if the 'data' column has long values.
  • Sort the full list of items with data + item_id.
  • Limit the list to the first item.
  • Return the first item.
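
One way to check this hypothesis is to look at the query plan, for example from devel/php. This is only a sketch; the queue name here is made up, so substitute one that really exists in your 'queue' table:

$name = 'drupal_batch:123:0';
// "Using filesort" in the Extra column would mean MySQL materializes and
// sorts full rows, including the large 'data' values, before applying the
// LIMIT. That matches the behavior described in the list above.
$plan = db_query('EXPLAIN SELECT data, item_id FROM {queue} q WHERE name = :name ORDER BY item_id ASC', array(':name' => $name))->fetchAll();
print_r($plan);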

It is possible to run this query directly in drush sqlc (sql-cli) or in devel/php (an example follows the list), without running a batch, given that:

  • the data is still present in the table.
  • you specify the correct 'name' parameter which actually has this huge result set.
  • the query contains an ORDER BY. Without that, it will not get stuck.
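
For example, a devel/php snippet like this should reproduce it (again a sketch; the queue name is hypothetical, use one that actually has the huge result set):

$name = 'drupal_batch:123:0';
// With enough data per row this query hangs; without the ORDER BY it
// returns quickly.
$item = db_query_range('SELECT data, item_id FROM {queue} q WHERE name = :name ORDER BY item_id ASC', 0, 1, array(':name' => $name))->fetchObject();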

Symptoms:

  • If the data is sufficiently large, the query gets stuck.
  • If the data is not too large, the query is quite fast.
  • I have not found a case where the query is simply slow but eventually finishes. This suggests that this is a memory issue inside the database engine, NOT a performance issue.

Solution

The solution is quite simple:

  • First fetch only the item_id, so that the temporary list inside MySQL will be smaller.
  • Use a second query to fetch the data for the given item_id, as sketched below.
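
A minimal sketch of such a claimItem() for Drupal 7 (an illustration of the idea, not the actual patch):

  public function claimItem($lease_time = 0) {
    // Step 1: Fetch only the item_id, so MySQL only has to sort small rows.
    $item_id = db_query_range('SELECT item_id FROM {queue} q WHERE name = :name ORDER BY item_id ASC', 0, 1, array(':name' => $this->name))->fetchField();
    if ($item_id === FALSE) {
      return FALSE;
    }
    // Step 2: Fetch the (possibly huge) data for that single item by its
    // primary key, with no ORDER BY involved.
    $item = db_query('SELECT data, item_id FROM {queue} q WHERE item_id = :item_id', array(':item_id' => $item_id))->fetchObject();
    if ($item) {
      $item->data = unserialize($item->data);
      return $item;
    }
    return FALSE;
  }
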
πŸ“Œ Task
Status

Needs review

Version

7.0 ⚰️

Component

batch system

Created by

πŸ‡©πŸ‡ͺGermany donquixote


Comments & Activities


  • πŸ‡ΊπŸ‡ΈUnited States douggreen Winchester, VA

    I came to a similar solution on my own, but I think you want to update the expire time before you grab the data.

  • πŸ‡ΊπŸ‡ΈUnited States douggreen Winchester, VA

    And modern Drupal has the same problem; see DatabaseQueue.

    My summary of what is happening: say you have 10,000 items in your queue and the serialized data for each is 5MB (our actual use case). Two processes try to claim a queue item at the same time. Each process's query takes multiple seconds (up to 30 seconds) to fetch the same item and data, but only one of them will then succeed in setting the expire and locking the item, so the second process just wasted 30 seconds trying to get a lock.

    We should instead just get the item, set the expire, and then get the data.
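
    A minimal sketch of that claim order (hypothetical, not the actual core code), assuming Drupal 9+'s DatabaseQueue-style API:

    public function claimItem($lease_time = 30) {
      while (TRUE) {
        // Step 1: Fetch only the item_id of the oldest unclaimed item; the
        // database never sorts or transfers the large 'data' blobs here.
        $item = $this->connection->queryRange('SELECT item_id, created FROM {queue} q WHERE expire = 0 AND name = :name ORDER BY created, item_id ASC', 0, 1, [':name' => $this->name])->fetchObject();
        if (!$item) {
          return FALSE;
        }
        // Step 2: Try to lock the item by setting its expire timestamp.
        // Only one competing process can win this update.
        $updated = $this->connection->update('queue')
          ->fields(['expire' => time() + $lease_time])
          ->condition('item_id', $item->item_id)
          ->condition('expire', 0)
          ->execute();
        if ($updated) {
          // Step 3: Only the winning process pays to transfer the payload.
          $item->data = unserialize($this->connection->query('SELECT data FROM {queue} WHERE item_id = :item_id', [':item_id' => $item->item_id])->fetchField());
          return $item;
        }
        // Lost the race: loop and try the next item.
      }
    }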

  • πŸ‡ΊπŸ‡ΈUnited States douggreen Winchester, VA
  • πŸ‡³πŸ‡ΏNew Zealand quietone

    Changes are made on 11.x (our main development branch) first, and are then backported as needed according to the Core change policies.

    The technical details in the summary contain Drupal 7 code and the patch is for Drupal 7. Tagging for an issue summary update.
