- Merge request !15Issue #3281020: referencedEntities: Use loadMultipleRevisions instead of loadRevision → (Open) created by spleshka
- 🇪🇪Estonia rang501 Viljandi
I can add that it seems to work on our projects, no issues so far.
- 🇨🇭Switzerland berdir Switzerland
#12 hasn't been addressed, as suggested there, this should use the optimized approach that assumes that the default revision is referenced and optimize for that and only fallback to loading the revision if that doesn't match.
loadRevisionMultiple() should only lead to a marginal performance improvement, it just does grouped queries but it does not use a static or persistent cache.
Also, I recommend to _not_ use this method at all, at least not until it's optimized. just loop over the items, that will be faster assuming you are viewing mostly default revisions, it will do single loads but from cache.
- 🇮🇳India lhuria94
Ok we want to use it only specifically JSON:API Search API endpoints, so nothing custom over there.
And after applying the patch we saw a considerable improvement overall. Like earlier it used to to take 15s and it has gone to half.
- 🇨🇭Switzerland berdir Switzerland
Right, json:api uses that. Not sure if that's a good idea, but out of scope.
What I'm proposing isn't rocket science, just do a loadMultiple() call first, compare the revision ID and follow-up with a loadRevisionMultiple() for those where it doesn't match. The result, assuming you actually have default revisions, should be a lot better than half of 15s.
- 🇮🇳India lhuria94
Gotcha, Thanks for the input. Appreciate it.
We will try that, compare and contrast and hopefully put up here a optimised patch if it works all good. - 🇮🇳India sumit-k
Hi Team,
Have done some optimization as suggested by berdir #17. Used loadMultiple(), Earlier API call was taking around 30s, can notice 15-20s improvements after changes. Sharing patch for review.
- 🇨🇭Switzerland berdir Switzerland
You have a typo in there (enity ids), you also need to check the loaded entity if it's matching the revision id.
- 🇪🇪Estonia rang501 Viljandi
About #19, doesn't
$item->entity->id()
cause extra separate entity load?
Although, it seems to improve saving time, in my case ~200 less db queries (originally ~3600, #7 ~2900, #19 ~2700).
- 🇨🇭Switzerland berdir Switzerland
Yes, that already loads the entity, correct. It should use $item->target_id instead.
- 🇮🇳India sumit-k
Thanks for reviewing berdir and rang501. Sharing updated patch. Here trying to load the same entity which has a revision id.
elseif ($item->target_revision_id !== NULL) { $ids[$delta] = $item->target_revision_id; $entity_ids[$item->target_revision_id] = $item->target_id; }
Loading the same entity_ids.
if ($ids) { $target_type = $this->getFieldDefinition()->getSetting('target_type'); $storage = \Drupal::entityTypeManager()->getStorage($target_type); $entities = $storage->loadMultiple($entity_ids);
- 🇪🇪Estonia rang501 Viljandi
The new patch seem to give same results, dropping from 2900 queries to 2700.
It seems that it improves validation in ParagraphsLibraryItemHasAllowedParagraphsTypeConstraintValidator which caused 200 queries before because it scans all paragraphs.
- 🇩🇪Germany h1nds1ght
loadRevisionMultiple() should only lead to a marginal performance improvement, it just does grouped queries but it does not use a static or persistent cache.
Also, I recommend to _not_ use this method at all, at least not until it's optimized. just loop over the items, that will be faster assuming you are viewing mostly default revisions, it will do single loads but from cache.Well
loadRevision
executesloadMultipleRevisions
, so they share the same drawbacks, except thatloadMultipleRevisions
on multiple revision ids reduces the amount of queries, which on the other hand has a larger effect the more revisions you reference.But using
loadMultiple
is definitely a point, since it will ensure using the entity cache system, which is unfortunately not available for revisions and I think this is out of scope of this issue.So, why not combining both? Loading default revisions with
loadMultiple
and everything else withloadMultipleRevisions
? - 🇨🇭Switzerland berdir Switzerland
> So, why not combining both? Loading default revisions with loadMultiple and everything else with loadMultipleRevisions?
That is exactly what I'm saying says, and the patches are moving in that direction, but are not yet correctly checking that.
The problm is that you don't know if a id/revision id pair is the default revision or not, so it needs to be loaded and then the revision id compared and reloaded if not. See #12/15.
- 🇩🇪Germany h1nds1ght
OK, created another patch. The problem is a little bit that
loadMultiple
has entity_ids as keys and we cannot rely on that entity_id = revision_id (the last patch did so). This makes it somewhat uncomfortable but one can resolve that with another loop. I checked for revision_ids now, and not forisDefaultRevision
, since I think we are interested in revisions here and each revision_id should be unique per storage and should belong to a specific entity_id. But maybe this assumption is wrong. If so we need to add the check forisDefaultRevision
.Also I am not sure if the checks for
RevisionableInterface
andRevisionableStorageInterface
are really necessary. If we can forget aboutRevisionableStorageInterface
we can also remove the lastloadRevision
.One thing that comes in my mind are deleted revisions. With that approach a deleted revision might involve 3 db queries.
- 🇨🇭Switzerland berdir Switzerland
+++ b/src/EntityReferenceRevisionsFieldItemList.php @@ -20,26 +21,36 @@ class EntityReferenceRevisionsFieldItemList extends EntityReferenceFieldItemList + if ($entity instanceof RevisionableInterface && in_array($entity->getRevisionId(), $revision_ids)) { + unset($ids[$entity->getRevisionId()]); + $entities[$entity->getRevisionId()] = $entity; + } + }
you need to unset in $revision_ids not the $ids?
- 🇩🇪Germany h1nds1ght
Can't see why. Note that i renamed the variable $ids to $revision_ids in the first loop. $revision_ids now holds the delta => revision_id relationship for the field list and $ids is a mapping of revision_id => entity_id, which is only used as temporary variable for loadMultiple and loadMultipleRevisions.
If I remove entries from $revision_ids they are also removed from the final field list.
- 🇨🇭Switzerland berdir Switzerland
The logic there is to not load the revisions of references that we matched, $ids is not used anymore after that. Right now you still load all revisions again?
- 🇩🇪Germany h1nds1ght
Not sure if I get your point.
$ids is a mapping of revision_id => entity_id. The loadMultiple call should therefore load each entity in its current revision. The corresponding loaded revision_id of the loaded entity is removed from $ids and the entity aka current revision is added to $entities.
+ foreach ($storage->loadMultiple($ids) as $entity) { + if ($entity instanceof RevisionableInterface && in_array($entity->getRevisionId(), $revision_ids)) { + unset($ids[$entity->getRevisionId()]); + $entities[$entity->getRevisionId()] = $entity; + } + }
$ids is then used one line after that again:
+ $entities += $storage instanceof RevisionableStorageInterface ? $storage->loadMultipleRevisions(array_flip($ids)) : [];
and loads the remaining revisions, which haven't been loaded in the first step by loadMultiple (the array_flip passes the remaining revision ids). This should be in most cases an empty array, except if the current revision of an loaded entity does not equal an referenced revision. The call for loadMultipleRevisions should then deliver an array with revisions keyed by revision_id, and the result is then added to $entities, which should add the missing revisions since it is also an array of revisions keyed by revision_id.
Finally the collected $entities (aka revisions keyed by revision_id) is used for the field list, with a fallback to the default loadRevision (= current solution).
+ foreach ($revision_ids as $delta => $revision_id) { + $entity = $entities[$revision_id] ?? $storage->loadRevision($revision_id);
As mentioned this last loadRevision seems somehow unnecessary.
Maybe this can be written more reasonable, but I can't see what should be wrong with that approach.
- 🇪🇪Estonia rang501 Viljandi
Is this still relevant after 🐛 referencedEntities() causes data loss Needs review ?
- 🇩🇪Germany h1nds1ght
Yes I think so, because the topic here is that EntityReferenceRevisionsFieldItemList::referencedEntities loads referenced entities using single entity loads, which still happens via $item->entity.
public function referencedEntities() { $target_entities = []; foreach ($this->list as $delta => $item) { if ($item->entity) { $target_entities[$delta] = $item->entity; } } return $target_entities; }
EntityReferenceFieldItemList::referencedEntities i.e. uses loadMultiple...
- First commit to issue fork.
- 🇧🇪Belgium dieterholvoet Brussels
dieterholvoet → changed the visibility of the branch 8.x-1.x to hidden.
- Merge request !61Add rebased version of entity_reference_revisions-performance-improvement-3281020-2.patch → (Open) created by dieterholvoet