Delete items from field data and field revision tables should be done in bulk

Created on 17 June 2024, 10 months ago

Problem/Motivation

Deleting 100M content entities is slow.

Steps to reproduce

1. Create 100M content entities on an entity with 100 fields
2. Delete all the content entities
3. Look at query log and see unoptimized deletion:

Proposed resolution

In SqlContentEntityStorage::doDeleteFieldItems we delete data from the base and revision tables in bulk, but we call ::deleteBulkFromDedicatedTables one at a time:

    if ($this->revisionDataTable) {
      $this->database->delete($this->revisionDataTable)
        ->condition($this->idKey, $ids, 'IN')
        ->execute();
    }

    foreach ($entities as $entity) {
      $this->deleteFromDedicatedTables($entity);
    }

That results in a ton of separate SQL queries that could be grouped into one.

It seems pretty straightforward to do, as deleteFromDedicatedTables() is specific to SqlContentEntityStorage.

Making this change resulted in the duration going from several days to hours.

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

πŸ› Bug report
Status

Active

Version

10.2 ✨

Component
EntityΒ  β†’

Last updated 1 day ago

Created by

πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @djdevin
  • πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia
  • Status changed to Needs review 10 months ago
  • πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia
  • πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia
  • πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia
  • Pipeline finished with Success
    10 months ago
    Total: 512s
    #201249
  • Status changed to Needs work 10 months ago
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    MR should be against 11.x as the latest development branch

    Feels like a change that needs some kind of test coverage also.

  • Pipeline finished with Canceled
    10 months ago
    Total: 94s
    #202092
  • Pipeline finished with Failed
    10 months ago
    Total: 514s
    #202096
  • πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia

    Rebased against 11.x (applies to 10.3.x cleanly as well)

    deleteFromDedicatedTables() doesn't have any test coverage - since this isn't a bug I wonder if having coverage through SqlContentEntityStorageSchemaTest is enough?

  • First commit to issue fork.
  • Pipeline finished with Success
    3 months ago
    Total: 336s
    #415665
  • πŸ‡ͺπŸ‡ΈSpain vidorado LogroΓ±o (La Rioja)

    I've added a kernel test to ensure that a single delete query is executed against both the data table and the revision data table.

    I'm not sure if directly testing the protected deleteFromDedicatedTables() method is the best approach, for the sake of simplicity, or if I should have mocked even more components to test it only through the public interface of SqlContentEntityStorage.

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Left comments on MR. Thanks!

  • πŸ‡ΊπŸ‡ΈUnited States nicxvan

    Any chance you can compare deleting that many entities with and without this change?

    Also memory usage would be helpful.

    Also 100M, is that 100 million?

  • πŸ‡ͺπŸ‡ΈSpain vidorado LogroΓ±o (La Rioja)

    Thanks for the review, @smustgrave! I've applied your suggestions.

    @nicxvan: I haven't performed the test suggested in the IS, but I believe it refers to 100 million. In my opinion, that’s an extremely unlikely case. In most scenarios, there would be only a few bulk deletions, and the gain would be minimal, but I still think it’s worth doing things the right way when the fix is so simple.

  • Pipeline finished with Success
    2 months ago
    Total: 640s
    #427917
  • πŸ‡ΊπŸ‡ΈUnited States djdevin Philadelphia

    It was part of a multifaceted approach so I'm not sure exactly how much this was part of the gained performance. I used a Drupal queue and ran multiple instances of queue-process to delete a batch of 1000 entities at a time in parallel. A custom progress bar running periodically looked at the number of items at the start and the number of items left in the queue.

    Didn't have significant issues with memory, I believe it was solely the efficiency loss from running thousands of delete queries instead of 1.

    I was exaggerating a little bit and don't have hard benchmarks, but at 10M (million) it went from ~200 hours to 6 hours or so. They were also entities with ~200 fields (paragraphs) so there were a ton of single field delete queries which is where I identified the issue. So deleting a single entity cost 402 queries. Deleting 1000 of them cost 402,000 (2000 entity, revision, and 2*200*1000 field queries).

    With the fix, deleting 1000 entities cost 2400 (1000*2 entity, revision and 200*2 fields and revisionqueries)

    At 402,000 queries per batch of 1000, the issue is exacerbated if you are using something like AWS Aurora or MySQL with replication in a durable ACID configuration, and if the database isn't local to the network. Both contribute to delay when writing tons of data.

    tl;dr: Bulk queries are recommended anyway.

  • πŸ‡ΊπŸ‡ΈUnited States nicxvan

    nicxvan β†’ changed the visibility of the branch 11.x to hidden.

  • πŸ‡ΊπŸ‡ΈUnited States nicxvan

    ~200 hours to 6 hours or so

    wow!

  • πŸ‡ΊπŸ‡ΈUnited States nicxvan

    I did a quick pass all a couple of comments.

    One is addressed, one has not.

    Test looks good and test only fails.

    Needs work for the last comment.

  • πŸ‡ͺπŸ‡ΈSpain vidorado LogroΓ±o (La Rioja)

    @nicxvan, GitLab may have misleaded you (and me too, for a moment). In the MR overview it doesn't say that the code was changed, but it indeed was! :)

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Believe all feedback on this one has been addressed.

  • πŸ‡¬πŸ‡§United Kingdom catch

    Overall this looks great and I couldn't find a single override or call to the method in contrib, however just in case we should add a change record here since it will be a hard break if someone is overriding or calling this method in custom code.

  • πŸ‡ͺπŸ‡ΈSpain vidorado LogroΓ±o (La Rioja)

    Done! :)

  • πŸ‡ΊπŸ‡ΈUnited States nicxvan

    I updated the CR for clarity, I agree this is RTBC again.

    • catch β†’ committed 3dd9a550 on 11.x
      Issue #3455151 by djdevin, vidorado, nicxvan, smustgrave: Delete items...
  • πŸ‡¬πŸ‡§United Kingdom catch

    Committed/pushed to 11.x, thanks!

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024