Investigate performance of block_content_post_update_set_owner

Created on 5 March 2024, 11 months ago
Updated 8 March 2024, 11 months ago

Problem/Motivation

On a site with 18648 blocks, this post-update hook takes around 30 minutes to complete. For some customers, 30 minutes of maintenance mode can be highly disruptive to business, requiring the pausing of hundreds of marketing campaigns spend, disabling machine learning of third party systems, etc...

There may be room for improvement here to help alleviate the service disruption window for users upgrading.

Steps to reproduce

Install an off-the-shelf standard install with minimum fiddling.
Load the site up with, say, 20,000 blocks via Devel.
Run the update hook and wait.

Proposed resolution

TBD

Remaining tasks

TBD

User interface changes

None

API changes

None

Data model changes

None

Release notes snippet

πŸ“Œ Task
Status

Closed: outdated

Version

11.0 πŸ”₯

Component
Block contentΒ  β†’

Last updated 8 days ago

Created by

πŸ‡ΊπŸ‡ΈUnited States luke.leber Pennsylvania

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @luke.leber
  • First commit to issue fork.
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10
  • Pipeline finished with Failed
    11 months ago
    Total: 532s
    #112210
  • Pipeline finished with Success
    11 months ago
    Total: 520s
    #112225
  • Status changed to Needs review 11 months ago
  • πŸ‡¬πŸ‡§United Kingdom catch
  • Status changed to RTBC 11 months ago
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    I originally had a direct query running (with the min expression) but it was weirdly buggy and thankfully caught by the test coverage I added due to swapping back to a lower uid. https://git.drupalcode.org/project/drupal/-/merge_requests/5500#note_234004

    I'm not sure why I didn't think of simply using a sort + limit of 1 but this makes a lot more sense.

    Given that tests are passing I see this as an easy win.

  • πŸ‡ΊπŸ‡ΈUnited States luke.leber Pennsylvania

    Strangely enough with the MR changes applied I get:

    >  [notice] Update started: block_content_update_10301
    >  [notice] Installed uid field for BlockContent entities.
    >  [notice] Update completed: block_content_update_10301
    >  [notice] Update started: block_content_post_update_set_owner
    >  [error]  SQLSTATE[42000]: Syntax error or access violation: 1140 In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'drupal.bcr.revision_user'; this is incompatible with sql_mode=only_full_group_by: SELECT "bcr"."revision_user" AS "uid", MIN(revision_id) AS "revision_id"
    > FROM
    > "block_content_revision" "bcr"
    > WHERE "id" = :db_condition_placeholder_0; Array
    > (
    >     [:db_condition_placeholder_0] => 21
    > )
    >  
    >  [error]  Update failed: block_content_post_update_set_owner 
     [error]  Update aborted by: block_content_post_update_set_owner 
     [error]  Finished performing updates. 
    
    real    0m11.731s
    user    0m6.789s
    sys     0m0.518s
    
    

    Just using the off-the-shelf mysql configuration for Ubuntu 22.04; I've not ran this in a hosting provider environment.

  • πŸ‡¦πŸ‡ΊAustralia acbramley

    @Luke.Leber the MR removes the MIN expression. Are you sure you're using the latest changes and it applied correctly?

  • πŸ‡ΊπŸ‡ΈUnited States luke.leber Pennsylvania

    I have some metrics. I don't know how meaningful they are, but...

    The initial run that spurred me into making this issue:

    real    28m32.144s
    user    9m47.665s
    sys     0m55.545s
    

    Now that I've ran this a couple more times, I'm assuming that this long run can be attributed to Microsoft Defender running on the Linux environment that was running the update. (Don't ask me to explain why it's a good idea for Microsoft Defender to run on Linux environments -- ask the PSU security office...)

    More closely monitored runs -- where MS Defender was not engaged:

    Before the patch:

    real    14m4.685s
    user    5m2.253s
    sys     0m11.007s
    

    After the MR is applied:

    real    16m12.257s
    user    6m41.996s
    sys     0m14.335s
    

    I'm not sure that the optimization has a meaningful impact based on these numbers. Performance testing is pretty tough to nail down given the sample size and my ability / time to run upgrades against the environment.

  • Status changed to Needs work 11 months ago
  • πŸ‡ΊπŸ‡ΈUnited States luke.leber Pennsylvania

    If (that's a big if) I have time tomorrow, I'll try to run this with xdebug profiling to help visualize what's taking the most time.

    As somewhat of a side-note, I've found that the Entity API is not very well suited for crunching tons of data. I've found that using the database API for updates has been one of the most meaningful impacts on performance. Perhaps we can use the database API for setting the owner versus calling $blockContent->setOwnerId('blah')->setSyncing(TRUE)->save() to trim minutes off?

    Setting back to NW based on the (flaky?) real world test runs.

  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Testing on our project with 3.5k blocks (35k revisions)

    Before:

    > time drush updb -y
    ....
    >  [notice] Processed Block Content Entities (3590/3590)
    >  [notice] Update completed: block_content_post_update_set_owner
     [success] Finished performing updates.
    
    real	5m12.711s
    user	1m27.592s
    sys	0m6.600s
    

    After:

    > time drush updb -y
    ....
    >  [notice] Update completed: block_content_post_update_set_owner
     [success] Finished performing updates.
    
    real	4m54.783s
    user	1m18.913s
    sys	0m5.651s
    

    So yeah, very little improvement on that. What @Luke.Leber has said is correct - to get meaningful performance improvments with these types of updates you really have to bypass ->save() entirely.

    On our project we've started heavily utilising direct db updates/inserts for content migrations. E.g we could do something like this:

    $tableMapping = $blockContentStorage->getTableMapping();
          foreach ($tableMapping->getAllFieldTableNames($fieldName) as $tableName) {
            $database->update($tableName)
              ->fields(['uid' => $uid])
              ->condition('id', $blockContent->id())
              ->execute();
          }
    

    With that we could go further and not even load the blocks to begin with. I'll push something up shortly to demonstrate.

  • Status changed to Needs review 11 months ago
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Testing with the latest changes:

    >  [notice] Processed Block Content Entities (3600/3590)
     [success] Finished performing updates.
    
    real	0m15.351s
    user	0m6.604s
    sys	0m0.914s
    
  • Pipeline finished with Failed
    11 months ago
    Total: 564s
    #112394
  • Status changed to Needs work 11 months ago
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Obviously a bug there

  • Status changed to Needs review 11 months ago
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Bug was in the array_slice, should've been splice!

    New timing, I assume the last was so fast because it was updating the same rows over and over.

    real	1m5.031s
    user	0m7.018s
    sys	0m1.013s
    
  • Pipeline finished with Success
    11 months ago
    Total: 832s
    #112411
  • πŸ‡¬πŸ‡§United Kingdom catch

    Direct db update makes sense here. Would be good to know if/how this affects the contrib mongodb driver. Kicked off sqlite and postgres tests. Ignore the 8.3 ones, we have 8.3-specific test failures.

  • πŸ‡¬πŸ‡§United Kingdom longwave UK

    For things like this I wonder if the database driver can hint whether we can do raw SQL updates, and we have both paths in the update hook, falling back to entity save in non-SQL cases?

  • πŸ‡¬πŸ‡§United Kingdom catch

    Could we just check if the entity storage is SqlContentEntityStorage?

    If it's not, then we'd also need to do the select queries with pure entity queries too (which should be doable with a small adjustment to https://git.drupalcode.org/project/drupal/-/merge_requests/6929/diffs?co...).

  • πŸ‡·πŸ‡΄Romania amateescu

    @acbramley, could you run those tests with the MR from πŸ› Return early in \Drupal\layout_builder\InlineBlockEntityOperations::handlePreSave if the entity is syncing Needs review instead of this one?

  • πŸ‡³πŸ‡±Netherlands daffie

    For things like this I wonder if the database driver can hint whether we can do raw SQL updates, and we have both paths in the update hook, falling back to entity save in non-SQL cases?

    Sounds like a good idea to me. Raw queries are great for performance, only you need to know exactly how the entity data is stored in the database. Creating a fallback for non-sql databases is a great idea. It will be needed after we have a non-sql database driver like the one for MongoDB. It is not in core yet and a fallback will only needed for changes after there are sites running on MongoDB. For now, yust not needed. The database driver for MongoDB can mark the update test as skipped.

    Could we just check if the entity storage is SqlContentEntityStorage?

    Sorry, but no. The database driver for MongoDB uses SqlContentEntityStorage for its entity storage.

  • Status changed to Needs work 11 months ago
  • The Needs Review Queue Bot β†’ tested this issue. It no longer applies to Drupal core. Therefore, this issue status is now "Needs work".

    This does not mean that the patch necessarily needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.

    Consult the Drupal Contributor Guide β†’ to find step-by-step guides for working with issues.

  • πŸ‡¦πŸ‡ΊAustralia acbramley

    ✨ Add owner to the BlockContent entity type RTBC has been reverted so I guess we move these changes/discussions over there and close this off?

    Re #18 - no there was basically no change in times after applying that patch.

  • Status changed to Closed: outdated 11 months ago
Production build 0.71.5 2024