\Drupal\block_content\Plugin\Derivative\BlockContent::getDerivativeDefinitions could use an aggregate query instead of loadMultiple

Created on 24 January 2024, about 1 year ago

Problem/Motivation

Similar issue as πŸ› block_content block derivatives do not scale to thousands of block_content entities Needs work but a different approach to a fix

Steps to reproduce

Have 1000s of reusable block content entities.
See your site die with OOM on plugin definition rebuild.

Proposed resolution

Use an aggregate query instead of ::loadMultiple

Merge request link

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

πŸ“Œ Task
Status

Active

Version

11.0 πŸ”₯

Component
Block contentΒ  β†’

Last updated 4 days ago

Created by

πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @larowlan
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10
  • Merge request !6298Use aggregate query β†’ (Open) created by larowlan
  • πŸ‡³πŸ‡ΏNew Zealand danielveza Brisbane, AU

    Tested this on a large site with over 3000+ block content entities where I was running into OOM issues.

    After saving a block and therefore having the Block definition caches cleared:

    No patch: ~152MB memory usage
    With patch: 76MB memory usage

  • Status changed to Needs work about 1 year ago
  • πŸ‡¬πŸ‡§United Kingdom catch

    This looks great! One point on the MR to add a code comment. I still think we should do πŸ› block_content block derivatives do not scale to thousands of block_content entities Needs work but this will stop a lot of sites crashing in the meantime.

  • Status changed to Needs review about 1 year ago
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    Added a comment

  • Status changed to RTBC about 1 year ago
  • πŸ‡¬πŸ‡§United Kingdom catch

    Looks great. We've got the other issue(s) open to move away from derivatives, I think we could backport this to 10.2 as well.

  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    What happens when you add translations? I suspect that a translation with a different info/label will result in a separate key, and then it gets unpredictable which one you actually end up with? Maybe add a default langcode condition?

    FWIW, this feels like a misuse of aggregate queries to me, but it is obviously a big improvement, so...

  • Status changed to Needs work about 1 year ago
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10
  • Status changed to Needs review about 1 year ago
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    FWIW, this feels like a misuse of aggregate queries to me, but it is obviously a big improvement, so...

    Can you elaborate on why, I do this fairly regularly in contrib and custom code. Trying to understand the implications.

    Maybe add a default langcode condition?

    Done

  • Pipeline finished with Failed
    about 1 year ago
    Total: 619s
    #83987
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    > Can you elaborate on why, I do this fairly regularly in contrib and custom code. Trying to understand the implications.

    We always say that the only correct way to interact with entities is to load them, so that the appropriate hooks are running which *might* alter the data or also things like language negotiation*. We do not support any partial loading of field values or direct DB queries and an aggregate query is essentially that. Usually an aggregate query is combined with an aggregation function, where you count all nodes grouped by type or some other field and don't access the raw values. This is different.

    *I just realized that this isn't working at all right now anyway, block content labels always show up in the default language only (we kind of implicitly assume in most plugin caches that all labels are t() objects and translated when accessed so we don't cache per language), that's not the case here. This is already not working in any useful way in HEAD, but I could imagine something like a ContentTranslation object that you feed with all the translations of a given content entity that then acts like a TranslatableMarkup object.

  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    We always say that the only correct way to interact with entities is to load them, so that the appropriate hooks are running which *might* alter the data or also things like language negotiation*

    Thanks, I was thinking that might be the reason, so its good to get confirmation.

    Does that mean that this approach is a BC break, because someone could be implementing a load hook for block content and modifying the titles? Do we need to put this behind a feature flag that's opt-in? Thoughts @catch with your RM hat on?

  • Pipeline finished with Failed
    about 1 year ago
    Total: 477s
    #84455
  • Pipeline finished with Success
    about 1 year ago
    Total: 564s
    #84467
  • πŸ‡¦πŸ‡ΊAustralia larowlan πŸ‡¦πŸ‡ΊπŸ.au GMT+10

    I'm not aware of how trash module works but I got a bug report for a contrib module that leads me to suspect it might unset entities in a load hook if they're trashed. That might be something that has a side effect here.

  • Status changed to Needs work about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Seems 10.2.3 is out. Should this go to 10.3?

  • Assigned to catch
  • πŸ‡§πŸ‡ͺBelgium wim leers Ghent πŸ‡§πŸ‡ͺπŸ‡ͺπŸ‡Ί

    Per @larowlan in #12.

  • πŸ‡¨πŸ‡¦Canada phjou Vancouver πŸ‡¨πŸ‡¦ πŸ‡ͺπŸ‡Ί

    Patch doesn't seem to apply anymore.

  • Pipeline finished with Failed
    11 months ago
    Total: 580s
    #166799
  • First commit to issue fork.
  • Pipeline finished with Failed
    5 months ago
    Total: 85s
    #330922
  • First commit to issue fork.
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Rebased and fixed the PHPCS issue, also updated the deprecation message and CR.

  • Pipeline finished with Failed
    3 months ago
    Total: 507s
    #396247
  • Pipeline finished with Failed
    3 months ago
    Total: 454s
    #396269
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Random failures seem to be biting us hard here. Will leave it for now and rerun later.

  • Pipeline finished with Success
    3 months ago
    #396276
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Indeed was random, re-running fixed.

    left 1 comment on MR.

  • πŸ‡¬πŸ‡§United Kingdom catch

    Replying to the bc question from... this time last year.

    The main case for block titles being altered would be previewing forward revisions, but workspaces will handle that when you're in a workspace via entity query rewrites.

    For trash module I think that it will rewrite entity queries to not include the entity when it's trashed - but we should ideally check with @amateescu.

    I think this is a sufficiently serious issue that we should not put the fix behind a feature flag but would be good to confirm the trash case first.

  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Thanks @catch, back to NR.

  • πŸ‡©πŸ‡ͺGermany solimanharkas Hamburg

    After debugging a Drupal website running version 10.4, we frequently encountered 502 errors due to an excessive number of reusable block content entities. However, enabling the MR seems to have resolved the issue. Thank you very much!"

  • πŸ‡΅πŸ‡±Poland dmitry.korhov Poland, Warsaw

    We found an issue with aggregated approach, when we add a new block into Layout of a page the block's plugin is considered as "broken/missing".
    The root cause is in query itself, it does work differently in comparison with default approach:

    count($this->blockContentStorage->loadByProperties(['reusable' => TRUE])); // 5932
    
    count($block_contents = $this->blockContentStorage->getAggregateQuery()
          ->condition('reusable', TRUE)
          ->condition('langcode', $this->languageManager->getDefaultLanguage()->getId())
          ->groupBy('uuid')
          ->groupBy('info')
          ->accessCheck(FALSE)
          ->groupBy('type')
          ->execute();
    ); // 6277
    
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Confirmed #25

    I was able to reproduce this with the following steps:

    1. Install Drupal on demo_umami
    2. Create a new block in Spanish, NOT English (do not translate to english either)
    3. Add the block to a page
    4. Checkout this branch
    5. Clear cache
    6. See broken/missing block.

    This only happens if the block's original language is not the site's default language (i.e English for demo_umami)

    I'll look at fixing this and adding test coverage.

  • Pipeline finished with Success
    about 1 month ago
    Total: 814s
    #430260
  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Fixed and added test coverage.

  • πŸ‡΅πŸ‡±Poland dmitry.korhov Poland, Warsaw

    Marking this ticket to "Needs work" again as there are still some issues for blocks that have few translations with different info (see previous comment to MR)

  • πŸ‡¦πŸ‡ΊAustralia acbramley

    Speaking internally with @larowlan this might not be viable for multilingual sites (@berdir pointed this out way back in #8 and #11)

    @larowlan suggested we could detect > 1 language and fallback to the current behavior so sites that aren't multilingual benefit from the huge performance boost.

    Regardless we'll need extra test coverage for the new issues mentioned above #28

  • πŸ‡΅πŸ‡±Poland dmitry.korhov Poland, Warsaw

    So, if aggregate query cannot be use, then let's try chunked loadMultiple?
    E.g. we have 1000 blocks and we do load them in memory 50 per iteration.
    Some iterators/generators stuff might help with memory utilization (see https://www.drupal.org/project/drupal/issues/2577417 πŸ“Œ Add an entity iterator to load entities in chunks Needs work )

Production build 0.71.5 2024