Add a garbage collector to clean up the queue for failed sitemap batches

Created on 8 February 2023, over 1 year ago
Updated 11 April 2024, 2 months ago

Problem/Motivation

I'm working on a website with more than 10k items to be indexed, and all of the items are reindexed every day. At some point the sitemap batch stopped working properly because of one of our custom modules. The main problem was that it took us several days to notice this behavior. When we looked at the queue table, it held more than 100k items, and the sitemap was taking a long time to run. After truncating the table, the slowness was solved.

The idea of this issue is to implement a garbage collector for failed items. Drupal already provides something like this: the DatabaseQueue class has a garbage collection method. The problem is that this method expects a drupal_batch: pattern in the name of the queue item, while the sitemap module uses simple_sitemap_elements, so its stale items are never cleaned up.

Proposed resolution

My idea is to update the module's cron hook and add the proper cleanup there.

We could also add a configuration option for how long we want to keep records in the table.
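As a rough illustration of the cleanup rule proposed above, here is a minimal sketch in Python against an in-memory SQLite table. It is not the attached patch and not Drupal code. The table layout (item_id, name, data, expire, created) follows Drupal's queue table, and simple_sitemap_elements is the queue name from this issue. The function names, the max_age parameter, and the ten-day default are assumptions made up for the example.

```python
import sqlite3

QUEUE_NAME = "simple_sitemap_elements"  # queue name mentioned in this issue
DEFAULT_MAX_AGE = 10 * 24 * 60 * 60     # hypothetical default: ten days, in seconds


def create_queue_table(conn):
    """Create a simplified stand-in for Drupal's queue table."""
    conn.execute(
        "CREATE TABLE queue ("
        "item_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL, "
        "data BLOB, "
        "expire INTEGER NOT NULL DEFAULT 0, "
        "created INTEGER NOT NULL DEFAULT 0)"
    )


def collect_garbage(conn, now, max_age=DEFAULT_MAX_AGE, name=QUEUE_NAME):
    """Delete items of one queue that were created more than max_age seconds ago.

    Returns the number of deleted rows. Items of other queues are untouched.
    """
    cur = conn.execute(
        "DELETE FROM queue WHERE name = ? AND created < ?",
        (name, now - max_age),
    )
    conn.commit()
    return cur.rowcount
```

A cron run would call collect_garbage with the current Unix timestamp. The max_age argument plays the role of the retention setting suggested above.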

📌 Task
Status

Postponed: needs info

Version

4.0

Component

Code

Created by

🇧🇷 Brazil murilohp

Comments & Activities

  • Issue created by @murilohp
  • Status changed to Needs review over 1 year ago
  • 🇧🇷 Brazil murilohp

    Here's a patch with the cron hook updated. I'm moving this to NR (Needs review) to see what you guys think.

  • Status changed to Postponed: needs info over 1 year ago
  • 🇩🇪 Germany gbyte Berlin

    That's interesting; thanks for the patch! Now I feel I have to check the batch table in my projects. What do other contrib modules do - build their batch garbage collectors like you did, or prefix their batch entries with 'drupal_batch:'?

  • 🇳🇱 Netherlands jaapjan

    Also using this patch on my project because the queue is full of simple_sitemap_elements. Any plans on implementing a solution in the module?
