Warning: Could not load the following items on index ...

Created on 5 October 2022, over 2 years ago
Updated 1 March 2023, almost 2 years ago

Problem/Motivation

We are observing warnings triggered by instant indexing of user's profile:
"Could not load the following items on index Engagements search: "entity:growth_engagement/317:und", "entity:growth_engagement/317:zxx"."

growth_engagement is an untranslatable (translatable = FALSE) custom entity, that refers to profiles. This relation is part of search API index.

After some debugging, I found at, that ContentEntity::getAffectedItemsForEntityChange, returns non existing language variations for growth_engagement, eg:

["entity:growth_engagement/317:und", "entity:growth_engagement/317:zxx", "entity:growth_engagement/317:en"]

In our context, database includes just "en" version of entity. As und and zxx versions gets scheduled for index, search API triggers warning as it obviously fails to load non-existing language versions.

Proposed resolution

My proposition is to check for referenced entities what language version they actually have. I'll include patch for review. I'm not fully happy with patch (although it fixed issue for me), as it loads entities objects, instead of leaving just entity ids for further processing.

🐛 Bug report
Status

Active

Version

1.25

Component

General code

Created by

🇵🇱Poland jsobiecki Wroclaw

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇨🇿Czech Republic siva01

    I have search_api 1.28.0 and patch fix that issue for me.

  • Status changed to Needs review almost 2 years ago
  • 🇵🇹Portugal rutiolma

    This patch fixes the problem but in my case it introduces a huge overload. I tried a similar approach which seems to be lighter.

  • 🇵🇹Portugal rutiolma

    Fixing previous patch errors.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update over 1 year ago
    521 pass, 2 fail
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update over 1 year ago
    538 pass
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks a lot for reporting this issue and already finding a solution.
    Both your analysis and the solution make sense, even though it’s of course unfortunate that we have to load a potentially large number of entities just to get their translations. But yes, no way around that, I fear – unless we just want people to have to live with the error in the log messages, which doesn’t really do any harm, either, as far as I can see. (Also, there is no error at all if “Index items immediately” is disabled.)
    So maybe we should actually make this fix optional? Not sure. I might at least mention it in the release notes for the next release.

    Anyways, attached is a slightly modified version of the patch, complete with a regression test. Please test/review, and tell me your opinion on the above.

  • First commit to issue fork.
  • Pipeline finished with Success
    4 months ago
    Total: 341s
    #294147
  • 🇧🇪Belgium dieterholvoet Brussels

    I added the changes from the last patch to the MR and rebased it on 8.x-1.x. I also pushed an improvement to the entity loading code: it splits the loading into chunks of (by default) 50 entities a time. This should prevent any memory issues if there are a lot of entities to load.

  • 🇧🇪Belgium dieterholvoet Brussels

    I think we should either commit the fix or remove the warning. Because now, for me, that warning has no meaning. I just ignore it by default because I know that most times there's no actual problem. And if people are ignoring it anyway, might as well get rid of it to reduce the noise.

  • Pipeline finished with Success
    4 months ago
    Total: 430s
    #294272
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks for creating an MR from my patch and adding the chunking, good idea!
    I added some small style fixes, would be ready to merge code-wise now in my opinion, if we want to go that route.

    However, since this is just about avoiding the warnings, I think there might be a better way (i.e., one that doesn’t cause performance problems) after all: When queueing the items for reindexing, the component in question (in this case, the ContentEntity datasource) could just tell the index “hey, I’m sending you these IDs for indexing, but I’m not actually sure they exist”. That way, the index can just skip logging warnings for those IDs.
    I created an MR based on that idea, please test/review.
    I’m also open to other suggestions for the architecture of passing that information to the index, the one I came up with was just a method and static property on the index.

  • codebymikey changed the visibility of the branch 3313569-fix-load-items-warning-v2 to hidden.

  • codebymikey changed the visibility of the branch 3313569-fix-load-items-warning-v2 to active.

  • 🇧🇪Belgium dieterholvoet Brussels

    Yeah sure, that works for me too. Do I understand correctly that all content entities are now considered unreliable items? With that in mind, don't you think it would be better to just remove the warning? Or are there other data sources out there that could profit from the warning being present?

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Do I understand correctly that all content entities are now considered unreliable items?

    No, this is only for any item IDs queued for immediate indexing when it is unclear whether the object represented by that ID actually exists on the site. It would specifically be used for this scenario of marking items as “changed” when a referenced entity changes, the warning would still apply in any other circumstances, when an item is marked as “changed” (and “Index items immediately” is enabled) without the item ID also being marked as “unreliable”.

  • 🇧🇪Belgium dieterholvoet Brussels

    Okay, sounds good! If this solution doesn’t have the possibility to cause performance problems it's probably the better one.

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Good to hear, thanks for the feedback!
    Since we’re adding a new API here, which is then pretty much set in stone, I’d still like to leave this open for a few weeks in case anyone wants to comment on the proposed architecture for this (even though the change is pretty small). I’ve also pointed people on Slack to this issue.

  • Pipeline finished with Success
    about 2 months ago
    Total: 436s
    #354449
  • Status changed to Fixed about 2 months ago
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks for merging HEAD, that seems to have fixed the CI fails!
    I think that was long enough for feedback, so I went ahead and merged this.
    Thanks again!

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024