Optimize for sites with a large number of entities

Created on 14 November 2024, about 1 month ago

Problem/Motivation

I opened another ticket to support our custom entities, but I don't think this module will work for us because of the sheer number of entities we have. We have roughly 150k entities of one type whose revisions need pruning, so looping over every one of them during cron to look for revisions is infeasible. A cron run like that is all but guaranteed to hit a timeout.

Steps to reproduce

Create a large volume of entities with revisions and try to clean them up.

Proposed resolution

A more efficient way to handle the cleanup would be to queue the entities that need to be cleaned, instead of queueing the revisions to be deleted.

Initially, you could do a one-time cleanup of all entities by adding them all to the queue. After that, instead of adding revisions to the queue during cron, you could use something like hook_entity_update() (or another hook, if there is one specific to revisions) to detect when a revision is created, and then add that entity to the queue so its revisions get processed.
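
The flow described above can be sketched roughly as follows. This is a standalone Python model of the idea, not Drupal code and not this module's implementation. Every name in it (EntityQueue, seed_queue, on_entity_update, prune_revisions, run_cron) is hypothetical, and the per-run item limit and "keep the newest N revisions" rule are assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class EntityQueue:
    """FIFO queue of entity IDs that ignores IDs already waiting (hypothetical)."""
    _items: deque = field(default_factory=deque)
    _pending: set = field(default_factory=set)

    def enqueue(self, entity_id: int) -> None:
        # Skip duplicates so repeated saves do not grow the queue.
        if entity_id not in self._pending:
            self._pending.add(entity_id)
            self._items.append(entity_id)

    def claim(self):
        # Return the next entity ID, or None when the queue is empty.
        if not self._items:
            return None
        entity_id = self._items.popleft()
        self._pending.discard(entity_id)
        return entity_id

    def __len__(self) -> int:
        return len(self._items)


def seed_queue(queue, entity_ids):
    """One-time step: enqueue every existing entity."""
    for entity_id in entity_ids:
        queue.enqueue(entity_id)


def on_entity_update(queue, entity_id):
    """Stand-in for an update hook: a save may have created a revision."""
    queue.enqueue(entity_id)


def prune_revisions(revisions, entity_id, keep):
    """Keep only the newest `keep` revisions (list is oldest first)."""
    ids = revisions.get(entity_id, [])
    removed = max(0, len(ids) - keep)
    if removed:
        revisions[entity_id] = ids[removed:]
    return removed


def run_cron(queue, revisions, keep, max_items):
    """Process at most `max_items` entities so a single run stays bounded."""
    processed = 0
    while processed < max_items:
        entity_id = queue.claim()
        if entity_id is None:
            break
        prune_revisions(revisions, entity_id, keep)
        processed += 1
    return processed
```

The point of the sketch is that each cron run only touches entities that are actually in the queue, and stops after a fixed number of them. The cost per run no longer depends on the total number of entities on the site.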

Remaining tasks

Update the queue architecture.

User interface changes

None

API changes

None

Data model changes

Change the queue to process entities instead of arrays of revisions.

✨ Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇺🇸 United States mrweiner


Comments & Activities

  • Issue created by @mrweiner
  • 🇺🇸 United States mrweiner

    I'm trying to decide whether we need to roll our own solution to this, or if we can contribute a patch here. Will post an update if I have one.

  • 🇺🇸 United States mrweiner

    It turns out that revisions were enabled for our entity type, but we never actually used them. As such, we have no revisions to clean up and don't need this module. That said, I still think this would be a good approach to improving performance if anybody is interested in tackling it.
