XXX items could not be indexed. Check logs for details.

Created on 20 June 2023
Updated 6 January 2024

Problem/Motivation

Users may see a warning that some items were not indexed if the search_api cron job runs while items are being indexed through the UI. This is misleading: all items are in fact indexed, yet the message claims that some are not.

Steps to reproduce

1. Install the latest Drupal 10 version along with the search_api, ultimate_cron, and devel (Devel Generate) modules.
2. Enable the Database Search, Database Search Defaults, Search API, Ultimate Cron, and Devel Generate modules, then disable the Database Search Defaults module (its preconfigured default content index remains in place).
3. Configure crontab to run the system cron every 5 minutes, and configure the Search API cron job to run every 5 minutes as well.
4. Disable the "Index items immediately" option for the Default content index.
5. Generate 20,000 nodes of the Article and Basic page bundles.
6. Go to /admin/config/search/search-api/index/default_index.
7. Clear all indexed data for the Default content index.
8. Start indexing content for the Default content index through the admin interface.
9. Make sure cron runs while the batch process is still in progress.
10. When the batch finishes, observe the warning message "XXX items could not be indexed. Check logs for details." Note that both the reported number of items and whether the warning appears at all vary randomly. Also, no corresponding log entries are created, despite what the message says.

Proposed resolution

I believe this happens because, when the search_api cron job runs, it processes some of the items that the UI batch expects to process itself, which breaks the batch's count of successfully indexed items. It may be worth implementing a lock on the index while either cron or the UI is indexing items into it.
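The suspected race can be sketched in a language-agnostic way. The Python below is purely illustrative – a runnable Drupal/PHP snippet would need a full site, so the names here (`tracker`, `index_batch`) are hypothetical stand-ins for Search API's item tracker and indexing batch, not its actual API:

```python
import queue
import threading

# Shared pool of items waiting to be indexed (stand-in for the index's tracker).
tracker = queue.Queue()
for item_id in range(100):
    tracker.put(item_id)

def index_batch(limit):
    """'Index' up to `limit` items; return how many were actually indexed."""
    indexed = 0
    for _ in range(limit):
        try:
            tracker.get_nowait()  # take one pending item and "index" it
            indexed += 1
        except queue.Empty:
            break  # nothing left to index
    return indexed

# The UI batch plans to index all 100 remaining items ...
planned = tracker.qsize()

# ... but a concurrent cron run indexes 30 of them first.
cron = threading.Thread(target=index_batch, args=(30,))
cron.start()
cron.join()

actually_indexed = index_batch(planned)
missing = planned - actually_indexed
print(f"{missing} items could not be indexed.")
```

The batch planned to index 100 items but only finds 70 still pending, so it reports 30 "failures" even though every item was indexed – the other 30 by cron.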

🐛 Bug report
Status

Fixed

Version

1.0

Component

User interface

Created by

🇺🇦Ukraine id.aleks


Comments & Activities

  • Issue created by @id.aleks
  • Status changed to Needs review over 1 year ago
  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update over 1 year ago
    542 pass, 2 fail
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks for reporting this issue!
    I see how this might happen. Unfortunately, with our current architecture it’s not really possible to tell how many items actually failed to be indexed, as Index::indexItems() doesn’t return that information.
    The easiest fix would probably be to phrase that warning message more carefully – up to XXX items might not have been indexed, but it’s also possible everything was fine. (See the attached patch.) That should be a bit less confusing, though of course still not very helpful.

    You’re probably right that the correct solution here is to apply locking when indexing, so that two separate processes don’t index items concurrently – especially since I think this can also mean that the same item is indexed more than once, which is just a waste of resources. (In fact, that seems like it would be much more common than the problem reported here, just not really noticeable for the user.)
    However, this would be a bit tricky to implement. I’ll try to look into it, but would also be very grateful for any patches.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update over 1 year ago
    544 pass
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Fixing the test fail.

  • Open in Jenkins → Open on Drupal.org →
    Core: 9.5.x + Environment: PHP 8.1 & sqlite-3.27
    last update over 1 year ago
    537 pass, 12 fail
  • 🇦🇹Austria drunken monkey Vienna, Austria

    This would additionally implement a simple locking mechanism for indexing operations. (I think we only need to lock the “index N items” operations, not those that index specific items, e.g., when saving an entity. However, this means that saving an entity while an indexing batch is in progress might still lead to the described misleading behavior. I’d therefore say we should also change the warning message, as proposed.)

    I guess this might still run into problems sometimes, when a cron job is executed right between two batch processing steps, but I don’t think it’s worth it to try and guard against that. (We’d essentially need to implement a lock that stays active across requests, which seems overly complicated.)
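    The non-blocking-lock idea can be sketched roughly as follows. This is a Python stand-in with illustrative names only – the actual patch would use Drupal’s lock service rather than threading.Lock:

```python
import threading

indexing_lock = threading.Lock()  # stand-in for a site-wide lock backend

class IndexingLockedError(Exception):
    """Raised when another process is already indexing this index."""

def index_items(limit):
    # Non-blocking acquire: bail out immediately instead of queueing
    # behind another indexing process.
    if not indexing_lock.acquire(blocking=False):
        raise IndexingLockedError(
            "Items are being indexed in a different process."
        )
    try:
        return min(limit, 10)  # pretend to index up to `limit` items
    finally:
        indexing_lock.release()

# A concurrent holder (e.g., a cron run) makes the UI's attempt fail fast.
indexing_lock.acquire()
try:
    index_items(50)
except IndexingLockedError as e:
    message = str(e)
finally:
    indexing_lock.release()
```

    Once the concurrent holder releases the lock, the next index_items() call succeeds normally.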

    Please test/review!

  • Status changed to Needs work over 1 year ago
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Waiting for feedback before I try to fix the latest patch – first I want to determine whether locking is even the right approach here.

  • 🇺🇦Ukraine id.aleks

    Hello @drunken monkey,

    Thank you for your work on this issue. Indeed, the situation has improved significantly with the patch. The wording of the message is clear, and the locking system has definitely resolved my issue. While I agree that the patch might not fix all possible cases, it's better than having no solution at all. The scenarios you mentioned are less common compared to this specific one, so I'm satisfied with this patch.

    I do have one note regarding an improvement: when a SearchApiException is thrown because the indexing lock is already held, the exception message is "Items are being indexed in a different process." However, IndexStatusForm::submitForm catches this exception and only displays a generic warning: "Failed to create a batch, please check the batch size and limit." I suggest aligning the warning message with the exception message to communicate more clearly to users what is happening.
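    Schematically, the suggestion is to let the caught exception’s own message reach the user instead of a generic warning (Python stand-in; the class and function names mirror, but do not reproduce, the actual Search API code):

```python
class SearchApiException(Exception):
    pass

def create_indexing_batch(locked):
    """Stand-in for the batch setup that fails when the lock is held."""
    if locked:
        raise SearchApiException(
            "Items are being indexed in a different process."
        )
    return "batch created"

def submit_form(locked):
    """Mimics the submitForm() handler catching the exception."""
    try:
        return create_indexing_batch(locked)
    except SearchApiException as e:
        # Before: a generic "Failed to create a batch ..." warning.
        # Suggested: surface the exception's own message to the user.
        return str(e)
```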

    Here is my test result:



  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks a lot for your feedback, good to hear the patch worked for you and you agree with its direction.

    I do have one note regarding an improvement: when a SearchApiException is thrown because the indexing lock is already held, the exception message is "Items are being indexed in a different process." However, IndexStatusForm::submitForm catches this exception and only displays a generic warning: "Failed to create a batch, please check the batch size and limit." I suggest aligning the warning message with the exception message to communicate more clearly to users what is happening.

    Also a very good find, thanks! In the attached revision I’m checking explicitly for the lock in submitForm() and display this warning if it’s not available:

    Cannot start indexing because another indexing process (like a cron job or Drush command) is already running.

    Does that sound good to you?

    The patch should hopefully also resolve the test fails.

  • Status changed to Needs review over 1 year ago
  • Open in Jenkins → Open on Drupal.org →
    Core: 10.0.11 + Environment: PHP 8.2 & sqlite-3.34
    last update over 1 year ago
    545 pass
  • 🇦🇹Austria drunken monkey Vienna, Austria
  • Status changed to RTBC over 1 year ago
  • 🇺🇦Ukraine id.aleks

    Hello @drunken monkey,

    Thank you once again for your contribution. The rephrased error message is much clearer now; from an end user's perspective it is easier to understand what is occurring. All tests are passing, so this looks good to me. I'm marking this issue as RTBC!

  • Status changed to Fixed over 1 year ago
  • 🇦🇹Austria drunken monkey Vienna, Austria

    Thanks for reporting back, great to hear!
    Merged. Thanks again!

  • Automatically closed - issue fixed for 2 weeks with no activity.
