BatchQueue::claimItem() can get stuck with huge data in batch queue

Created on 2 April 2020, about 5 years ago
Updated 23 January 2025, 4 months ago

TLDR

BatchQueue::claimItem() runs a query that can cause memory issues in the database engine if the queue table contains huge chunks of data for a single batch.

This can lead to the query getting stuck.

The visible effect is that the batch operation gets stuck / does not start.

Background / use case

On a project I am working on, I encountered a case where a query in BatchQueue::claimItem() got stuck, because another module (search_api_et) added unreasonably large chunks of data to the 'queue' table. See #3124298: Huge data size in batch queue items β†’ .

The first lesson of course is to fix that contrib module.

But I also found a possible (memory) optimization to the query in core, so that it will scale much better to huge batches. See #985184-6: Poor performance of batch API with large queues β†’ ff.

Technical details

The query looks like this:

class BatchQueue extends SystemQueue {
  [..]
  public function claimItem($lease_time = 0) {
    $item = db_query_range('SELECT data, item_id FROM {queue} q WHERE name = :name ORDER BY item_id ASC', 0, 1, array(':name' => $this->name))->fetchObject();
    [..]

In our case, the 'queue' table contained around 500 entries for a single batch name, and the 'data' column was quite big for each one.

Internally, it seems MySQL does the following:

  • Build the full list of 500 results with data + item_id. This can get very big if the 'data' column has long values.
  • Sort the full list of items with data + item_id.
  • Limit the list to the first item.
  • Return the first item.

It is possible to run this query directly in an SQL client or in devel/php, without running a batch, given that:

  • the data is still present in the table.
  • you specify the correct 'name' parameter which actually has this huge result set.
  • the query contains an ORDER BY. Without that, it will not get stuck.

Symptoms:

  • If the data is sufficiently large, the query gets stuck.
  • If the data is not too large, the query is quite fast.
  • I have not found a case where the query is simply just slow but eventually finishes. This suggests that this is a memory issue inside the database engine, NOT a performance issue.

Solution

The solution is quite simple:

  • First fetch only the item_id, so that the temporary list inside MySQL will be smaller.
  • Use a second query to fetch the data for the given item_id.

πŸ“Œ Task
Status

Needs review

Version

7.0 ⚰️

Component

batch system

Created by

πŸ‡©πŸ‡ͺGermany donquixote
