Limit number of entities displayed to increase performance on big sites

Created on 7 January 2025, 5 months ago

Problem/Motivation

  • On a large site with tens of thousands of nodes, deleting an entity with a lot of references may cause the request to time out.

Proposed resolution

  • Limit the number of entities displayed in the delete dialogue.
✨ Feature request
Status

Active

Version

1.0

Component

Code

Created by

πŸ‡¨πŸ‡¦Canada andrew.wang

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @andrew.wang
  • Status changed to Needs review 14 days ago
  • πŸ‡¨πŸ‡¦Canada andrew.wang

    Rerolled for 2.0.0

  • πŸ‡ΊπŸ‡ΈUnited States m.stenta

    Thanks @andrew.wang! This makes sense to me. I'm going to re-categorize this as a bug report, because it is a potential OOM error.

    Currently the patch changes EntityReferenceIntegrityEntityHandler::referentialEntityQuery() so that it always returns a maximum of 11 entities. This solves the OOM issue in the entity deletion form, as described above, but it could cause issues elsewhere. For example, if someone has downstream code that relies on this method to get a list of all entities, that would no longer work, and it wouldn't be clear why it's only returning 11.

    The current automated test for this (in EntityReferenceIntegrityEntityHandlerTest::testGetDependentEntityIds()) would miss this because it only tests with a single dependent entity. We should probably expand those tests to look for multiple entities, to prevent issues like this from being introduced accidentally.

    My assumption is the OOM error occurs when the entities are loaded (in EntityReferenceIntegrityEntityHandler::getDependentEntities()), not necessarily when the query itself is run, which only loads the IDs. If we can find a fix at that layer, and leave EntityReferenceIntegrityEntityHandler::referentialEntityQuery() as-is, then I think that would be ideal.

    Tracing the logic to find the best place for this... the entity_reference_integrity_enforce module's FormAlter::formAlter() (which is responsible for adding the list of dependent entities to the delete confirmation form) calls $handler->getDependentEntities($entity); (on this line), before ultimately passing the result of that to FormAlter::buildReferencingEntitiesList(), which simply iterates over the (already loaded) list of entities and makes a list of links to them.

    Side note: I like your approach of loading 11 entities, and then testing for >= 10 in that method. It's an elegant way to do it. So I think we can leave that part of your patch as-is, but we just need to rethink where we limit the query...

    Perhaps when FormAlter::formAlter() calls EntityReferenceIntegrityEntityHandler::getDependentEntities(), it could specify a $limit argument in that method. This could then be passed through to EntityReferenceIntegrityEntityHandler::getDependentEntityIds(), and then to EntityReferenceIntegrityEntityHandler::referentialEntityQuery(). The $limit could default to 0 (no limit), and then we could set it in the context of FormAlter::formAlter(), so it only gets added where we need it.

    Does that make sense?

  • πŸ‡ΊπŸ‡ΈUnited States m.stenta

    Hmm one tricky bit: referentialEntityQuery() returns a list of "all entities of the specified type which have any fields in the source field list that match the given target ID". This gets called by getDependentEntityIds() once for each entity reference field, and then organizes them into sub-arrays by entity type.

    So one call to getDependentEntityIds() results in potentially multiple calls to referentialEntityQuery(). But the $limit would be applied to each call to referentialEntityQuery()... which means calling it with a limit of 10 could result in more than 10 entities being returned.

    This is actually true of the current patch as well.

    So I think we have two options:

    1. Add a $limit parameter only to getDependentEntities() (but not to getDependentEntityIds() or referentialEntityQuery()) and add logic to only load that many entities.
    2. Don't add $limit parameters to any of the methods, and rework FormAlter::formAlter() so that it calls getDependentEntityIds() instead of getDependentEntities(), and load the first 10 ourselves for the purpose of displaying them in the confirmation form.

    I think my preference would be option 2, because it doesn't change the API surface of any methods, and is a targeted fix for this issue.

    It would still be good to add a test for this, to confirm that only 10 entities are shown in the form when there are more than 10 dependent entities. I don't think we necessarily need to expand the existing tests like I described in my previous comment, but it certainly wouldn't hurt either!

  • πŸ‡ΊπŸ‡ΈUnited States m.stenta
  • πŸ‡ΊπŸ‡ΈUnited States m.stenta

    I opened a merge request that takes the approach described above. Setting this back to Needs Review.

    Notably, I realized as I was doing this that it is still possible for the delete confirm form to display more than 10 entities, because it breaks them up by entity type, so rather than reworking all of that I just added a bit of logic to limit the number of entities it loads per type. So in theory it's still possible for this form to create an OOM error, but I think that would be an extreme edge case (multiple separate entity types referencing the same entity). We could take this further to ensure that only 10 entities are loaded overall (regardless of type), but that would require rethinking how we display the results. I decided not to go that far with this, since it should solve the primary issue for 99% of use-cases.

    Since this only affects the FormAlter class now, and since that class doesn't have any test coverage to begin with, I'm going to remove the "Needs Tests" tag. It would be great to add tests for this confirmation form in general, but I don't think that needs to be a blocker for this. I tested it myself locally and it's working, but I'll leave it as "Needs Review" until someone else has a chance to test it themselves to confirm.

    @andrew.wang: If you have a chance to test that would be great! Feel free to set to RTBC if it works for you and you agree with the approach.

Production build 0.71.5 2024