getParagraphWebformsRecursive causes slow load times

Created on 19 March 2021, over 4 years ago
Updated 20 April 2023, over 2 years ago

Problem/Motivation

Some of the entities we use with our website have paragraph fields containing a lot of data with some child paragraphs as well. getParagraphWebformsRecursive tries to go through all of them, causing extreme slow down (it can take up to a minute to load if the entity has a lot of data). This entity does not have any webforms in any paragraphs.

Steps to reproduce

Have entities with paragraph fields with a lot of data and child paragraphs that have their own paragraph fields. View the page. On Drupal 8.9.13 and uses Webform 8.x-5.25 and Paragraphs 8.x-1.12.

Proposed resolution

- Add a hook in order that the entity can skip getParagraphWebformsRecursive
- Try to check if the entity has any paragraphs with webforms before doing a recursive search
- Add UI to select the content types that uses getParagraphWebformsRecursive

πŸ› Bug report
Status

Needs review

Version

6.1

Component

Code

Created by

πŸ‡¦πŸ‡ΊAustralia heinvdb

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 9.5.x + Environment: PHP 8.1 & MySQL 5.7
    last update over 2 years ago
    537 pass
  • πŸ‡ΊπŸ‡ΈUnited States angrytoast PNW

    Thanks for the patch. We have a site that uses paragraphs heavily and are seeing performance issues in this area.

    I looked over the changes and it seems like there are a couple of improvements that can be made, how do these sound?

    1. In ::getFieldNames, instead of caching by the {{entity_type}}-{{entity_id}}, cache by {{entity_type}}-{{bundle}}. Especially for paragraphs, this accounts for multiple instances of a paragraph bundle on a page which is more likely given the reuse nature. As-is, the class cache var $fieldNames ends up looking like:

    node-1111 => fields
    paragraph-1001 => fields
    paragraph-2002 => fields
    paragraph-3003 => fields
    ... etc
    

    Instead it can be:

    node-bundle => fields
    paragraph-bundle1 => fields
    paragraph-bundle2 => fields
    ... etc
    

    The latter will likely have a better hit ratio for paragraphs.

    2. Cache field names returned in ::getParagraphFieldNames by entity type.
    This one gets called often with lots of paragraphs. We're actually seeing a lot of time spent here in the field_storage_config look up for paragraph field names. Since the field names are fixed per entity type and unlikely to change, thinking this is useful to cache both at the request level but also in Drupal cache for future requests since these aren't likely to change, and skipping the field_storage_config load helps across multiple page requests.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 9.5.x + Environment: PHP 8.1 & MySQL 5.7
    last update about 2 years ago
    537 pass
  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 9.5.x + Environment: PHP 8.1 & MySQL 5.7
    last update about 2 years ago
    537 pass
  • πŸ‡ΊπŸ‡ΈUnited States angrytoast PNW

    Add the patch changes in #16 to the existing MR with a minor change. Modified the field_storage_config<code> entity query lookup in <code>:: getParagraphFieldNames to target paragraph fields and get fields for all entity types at once instead of doing it gradually per type.

    https://git.drupalcode.org/project/webform/-/merge_requests/122/diffs?co...

  • Status changed to Needs work over 1 year ago
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    Just stumbled over this as well in profiling, makes up about 10% of the time on the one example page I looked at.

    The cache as it is does not use cache tags, so it doesn't account for field changes. One option is the use the entity_field_info cache tag to handle that.

    That said, what my plan was before I found this issue is to use the field storage definitions from the entity field manager and then check the type and target type setting in PHP, I think that should have minimal overhead as those definitions are very likely then already loaded at that point and a loop over that should be pretty fast.

    I'm uncertain if the extra cache on getWebforms() is worth it, do we have any data on that? _Maybe_ if the fields aren't initialized yet, but cache request aren't free.

    The constructor BC fallback is also not using the same bin as the service definition.

  • Status changed to Needs review over 1 year ago
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    Pushed an alternative suggestion as a new MR.

    We already have the entity object and its definitions, we can loop over them, just like getFieldNames(), there is no need to query/load field definitions separately.

    I also did include the static cache change for field names, but not the persistent cache, I'm pretty certain that a cache get is more expensive than a loop over field definition objects. I'm not even certain that the static method is worth it, quite possibly not.

    Locally in profiling, this method and its whole call chain no longer appears in blackfire which this change, but it's also worth pointing out that it was _way_ faster locally than the blackfire reports from production in the first place.

    I also changed the method to a generator but that could easily be changed back if that's deemed too much of a change.

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.4 + Environment: PHP 8.2 & MySQL 8
    last update over 1 year ago
    536 pass
  • Pipeline finished with Failed
    over 1 year ago
    Total: 2033s
    #95264
  • πŸ‡ΊπŸ‡ΈUnited States jrockowitz Brooklyn, NY
  • πŸ‡ΊπŸ‡ΈUnited States jrockowitz Brooklyn, NY

    @Berdir I am very open to your approach. I hope other people can weigh in.

  • πŸ‡ΊπŸ‡ΈUnited States angrytoast PNW

    +1 for @Berdir's approach in #19. Since the entity field definitions are already statically and persistently cached, it's a much simpler way of getting at the necessary field names and removing the repeated + slow field_storage_config loadByProperties call.

    With this, the static cache in WebformEntityReferenceManager::getFieldNames doesn't matter as much. Basic local benchmark (our own site, not a standard Drupal install) on a larger page consisting of 68 paragraph references, 5 unique bundles, shows on average ~0.0008s faster with the static cache. I think it's not a meaningful difference and can be removed to reduce code.

    Thoughts?

  • Open in Jenkins β†’ Open on Drupal.org β†’
    Core: 10.1.4 + Environment: PHP 8.2 & MySQL 8
    last update over 1 year ago
    536 pass
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    Thanks for the review. Fixed the check and removed the static cache.

  • Pipeline finished with Success
    over 1 year ago
    Total: 2262s
    #145574
  • Status changed to RTBC about 11 hours ago
  • πŸ‡¬πŸ‡§United Kingdom catch

    Just found this profiling a site too - was taking about 600ms, the change reduces it down to a couple of milliseconds so looks great to me.

    Also opened πŸ› Node test form access check runs for anonymous users Active which will have much less impact if this was committed, but was also the only caller of this method for anonymous users for me and can't see they'd ever need to access that route.

  • πŸ‡¨πŸ‡¦Canada Liam Morland Ontario, CA πŸ‡¨πŸ‡¦

    We will first need this on 6.3.x.

  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    Retargeted and rebased through the UI.

  • Pipeline finished with Canceled
    about 8 hours ago
    #553379
  • Pipeline finished with Failed
    about 8 hours ago
    #553380
  • πŸ‡¨πŸ‡­Switzerland berdir Switzerland

    berdir β†’ changed the visibility of the branch 3204456-paragraphs-slow to hidden.

Production build 0.71.5 2024