Batch to update entity usage statistics very slow

Created on 28 October 2021, over 2 years ago
Updated 19 June 2023, about 1 year ago

Problem/Motivation

I have a big site with more than 57,000 nodes and many paragraphs (maybe 5 times more than nodes). I'd like to see where my media items are used, but after installing this module I need to run the batch to update the entity usage statistics at /admin/config/entity-usage/batch-update.

It works, but it just took me 2 hours for only 9,000 nodes.
Can the batch performance be improved?

Steps to reproduce

Create a new site with 60,000 nodes and 2-3 paragraphs per node, using some media items.
Configure the module to track all usages in nodes and paragraphs.
Use the batch to update statistics.
Wait several hours or days.

Proposed resolution

I think it should be possible to improve the batch performance, but I'm not a specialist. Does anybody know if it's possible?
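One general technique for speeding up this kind of batch scan is keyset pagination: resuming each batch from the last processed ID instead of paging with a growing OFFSET, which forces the database to re-scan skipped rows. A minimal sketch in Python/SQLite follows; the table and column names are hypothetical and are not the entity_usage module's actual schema or code.

```python
import sqlite3

# Hypothetical node table; NOT the real Drupal / entity_usage schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (nid INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO node (nid, title) VALUES (?, ?)",
                 [(i, f"node {i}") for i in range(1, 101)])

def batches(conn, size=25):
    """Keyset pagination: resume from the last seen nid instead of OFFSET,
    so each batch is a short index range scan rather than a re-scan."""
    last_nid = 0
    while True:
        rows = conn.execute(
            "SELECT nid, title FROM node WHERE nid > ? ORDER BY nid LIMIT ?",
            (last_nid, size)).fetchall()
        if not rows:
            break
        yield rows
        last_nid = rows[-1][0]  # resume point for the next batch

processed = sum(len(batch) for batch in batches(conn))
print(processed)  # 100
```

Whether this helps here depends on how the module's batch currently selects the entities to process; it is offered only as a sketch of the pattern.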

Thank you for your help.

πŸ’¬ Support request
Status

Active

Component

Code

Created by


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡¨πŸ‡¦Canada joseph.olstad

Our Varnish-cached, purge-optimized production Drupal site became unresponsive and ran out of resources for 20 minutes, coinciding with an entity_usage query that showed up in our slow query log. The query examined 22 million rows (not sure why so many) and took over 5 minutes to execute on a Percona/MySQL-compatible database hosted on the Acquia platform.

    The query originates from the entity_usage module.

  • πŸ‡¨πŸ‡¦Canada joseph.olstad

    These two related issues won't help for bulk processing. Rather than using joins, we should consider writing a patch that replaces the query with one that uses subqueries instead of joins.

    Subqueries are much more efficient than joins.

    With that said, it's maybe going to be tricky to do.
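The join-to-subquery rewrite suggested in the comment above can be sketched as follows. The schema and data are made up for illustration (loosely modeled on the module's usage-tracking table, not copied from it), and whether `EXISTS` actually beats a `JOIN` depends on the database's optimizer and indexes, not on a universal rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE media (mid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE entity_usage (target_id INTEGER, source_id INTEGER);
INSERT INTO media VALUES (1, 'a.jpg'), (2, 'b.jpg'), (3, 'c.jpg');
INSERT INTO entity_usage VALUES (1, 10), (1, 11), (3, 12);
""")

# Join form: produces one row per usage record, so it can examine many
# rows and needs DISTINCT to collapse duplicates.
join_q = """SELECT DISTINCT m.mid FROM media m
            JOIN entity_usage u ON u.target_id = m.mid
            ORDER BY m.mid"""

# Subquery form: EXISTS can stop at the first matching usage row
# per media item instead of materializing every match.
sub_q = """SELECT m.mid FROM media m
           WHERE EXISTS (SELECT 1 FROM entity_usage u
                         WHERE u.target_id = m.mid)
           ORDER BY m.mid"""

# Both forms return the same set of used media IDs.
assert conn.execute(join_q).fetchall() == conn.execute(sub_q).fetchall()
print([r[0] for r in conn.execute(sub_q)])  # [1, 3]
```

On MySQL-family databases, comparing `EXPLAIN` output for both forms against the real tables would be the way to confirm which one the optimizer handles better.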

  • πŸ‡¨πŸ‡¦Canada joseph.olstad

    These two related issues won't help for bulk processing. Rather than use joins we should consider making a patch to write a replacement query that uses sub queries intead of joins.

    Sub queries are much more efficient than joins.

    With that said, it's maybe going to be tricky to do.
