Add a parallel tracker to allow concurrent indexing

Created on 1 June 2025

Problem/Motivation

The current Search API module only supports single-threaded indexing, which can be very slow for large datasets. When indexing millions of items, a single worker process may take hours or days to complete. This creates performance bottlenecks, especially during initial site setup, data migrations, or when rebuilding indexes.

The Basic tracker uses global locks that prevent multiple indexing processes from running simultaneously, even though the underlying search backends (like Solr, Elasticsearch, or Database) could handle concurrent indexing operations efficiently.

Steps to reproduce

  1. Create a large Search API index with thousands of items to be indexed
  2. Run `drush search-api:index my_index`
  3. While the first command is running, try to run a second `drush search-api:index my_index` command
  4. Observe that the second command fails with an error
  5. Notice that only one worker can process items at a time, leading to slow indexing performance

Proposed resolution

Implement a new "Parallel" tracker plugin that enables lock-free parallel indexing using database-level worker claims instead of global locks.

Once the tracker is switched to Parallel, all that is needed is to run:

$ drush search-api-mark-all

$ drush search-api:index & drush search-api:index & drush search-api:index & drush search-api:index & drush search-api:index &

and 5 workers will index all items of all indexes in parallel.

The solution includes:

Core Architecture Changes

  • Move the locking logic from the Index entity into the tracker plugins by extending TrackerInterface with `lock()`, `unlock()`, and `getLockId()` methods (see the interface sketch after this list)
  • Maintain backward compatibility by having Index delegate its locking operations to the tracker
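
A minimal sketch of what the extended interface could look like; the method names come from the proposal above, while the docblocks and exact signatures are my assumptions, not the final code:

```php
<?php

namespace Drupal\search_api\Tracker;

/**
 * Sketch of the proposed additions to TrackerInterface (assumed signatures).
 */
interface TrackerInterface {

  // ... existing tracker methods ...

  /**
   * Acquires an indexing lock (or, for the Parallel tracker, a worker claim).
   */
  public function lock(): void;

  /**
   * Releases the indexing lock or worker claim.
   */
  public function unlock(): void;

  /**
   * Returns the ID identifying this worker's lock or claim.
   */
  public function getLockId(): string;

}
```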

Parallel Tracker Implementation

  • Create a new Parallel tracker extending the Basic tracker
  • Add `expires_at` and `worker_id` columns to the `search_api_item` table for worker claim management (see the update sketch after this list)
  • Use a brief advisory lock around the select-then-claim step so that item claiming is free of race conditions
  • Implement automatic cleanup of expired worker claims
  • Make the claim timeout configurable (default: 60 seconds) to handle worker failures
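
A hypothetical update hook illustrating the two new columns; the field names match the proposal, but the update number, types, and lengths are assumptions:

```php
<?php

/**
 * Hypothetical sketch only: adds worker claim columns to search_api_item.
 */
function search_api_update_8XXX() {
  $schema = \Drupal::database()->schema();
  $schema->addField('search_api_item', 'worker_id', [
    'description' => 'UUID of the worker currently claiming this item.',
    'type' => 'varchar',
    'length' => 36,
    'not null' => FALSE,
  ]);
  $schema->addField('search_api_item', 'expires_at', [
    'description' => 'Timestamp at which this worker claim expires.',
    'type' => 'int',
    'not null' => FALSE,
  ]);
}
```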

Worker Claim Process

  1. Each worker gets a unique UUID when `lock()` is called
  2. Workers select available items (`worker_id IS NULL`) using the existing ordering logic
  3. A brief advisory lock (5 seconds) prevents MySQL deadlocks during claiming
  4. Workers claim the selected items by setting their `worker_id` and an `expires_at` timestamp, as sketched below
  5. Successfully indexed items are released; claims from failed workers automatically time out
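
A rough sketch of that select-then-claim cycle using Drupal's lock and database APIs; the lock name, the variables `$index_id`, `$batch_size`, and `$claim_timeout`, and the status condition are assumptions, not the merge request's actual code:

```php
<?php

// Rough sketch only; variable names and the advisory lock name are assumed.
$lock = \Drupal::lock();
$connection = \Drupal::database();

// 1. A unique ID for this worker.
$worker_id = \Drupal::service('uuid')->generate();

// 3. Brief advisory lock around the select-then-claim section.
if ($lock->acquire('search_api_parallel_claim', 5)) {
  try {
    // 2. Select available items: unclaimed, or with an expired claim.
    $item_ids = $connection->select('search_api_item', 'i')
      ->fields('i', ['item_id'])
      ->condition('index_id', $index_id)
      // Assuming status 1 marks items that still need indexing.
      ->condition('status', 1)
      ->where('worker_id IS NULL OR expires_at < :now', [':now' => time()])
      ->range(0, $batch_size)
      ->execute()
      ->fetchCol();

    // 4. Claim the selected items for this worker.
    if ($item_ids) {
      $connection->update('search_api_item')
        ->fields([
          'worker_id' => $worker_id,
          'expires_at' => time() + $claim_timeout,
        ])
        ->condition('item_id', $item_ids, 'IN')
        ->execute();
    }
  }
  finally {
    // 5. Release the advisory lock; the item claims themselves are released
    // after successful indexing or expire via the timeout.
    $lock->release('search_api_parallel_claim');
  }
}
```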

Improved User Experience

  • Enhance IndexBatchHelper to detect parallel indexing scenarios (see the sketch after this list)
  • Replace confusing "Couldn't index items" errors with helpful status messages
  • Show "No items available for this worker" instead of an error when other workers are handling the remaining items
  • Eliminate false "less than expected items indexed" warnings during parallel processing
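
The gist of that detection, as a hypothetical sketch rather than the merge request's actual IndexBatchHelper code:

```php
<?php

// Hypothetical sketch of the friendlier batch behavior; the real
// IndexBatchHelper logic may differ.
$indexed = $index->indexItems($batch_size);
if ($indexed === 0 && $index->getTrackerInstance()->getRemainingItemsCount() > 0) {
  // Items remain, but other workers hold the claims: report a status
  // message instead of a "Couldn't index items" error.
  $context['message'] = t('No items available for this worker.');
}
```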

Key Benefits

  • Drop-in replacement: works with existing `drush search-api:index` commands
  • Scalable: multiple workers can run simultaneously without conflicts
  • Fault-tolerant: failed workers automatically release their claims via the timeout
  • Backend-agnostic: works with the Database, Solr, Elasticsearch, and other backends
  • Configurable: adjustable claim timeouts

Remaining tasks

  • [ ] Add comprehensive test coverage for parallel indexing scenarios
  • [ ] Add documentation for parallel indexing setup and configuration
  • [ ] Performance testing with multiple concurrent workers
  • [ ] Test with different search backends (Solr, Elasticsearch)
  • [ ] Consider adding monitoring/logging for worker claim statistics

Note: AI was used for creating this documentation, and an augmented AI workflow was used for creating the patch, but all architecture, planning, and refinement were my own. The code is now exactly how I would have written it, had I written it 100% manually.

Feature request
Status

Needs review

Version

1.0

Component

General code

Created by

🇩🇪Germany Fabianx

Tags

  • Performance

Merge Requests

Comments & Activities

  • Issue created by @Fabianx
  • 🇩🇪Germany Fabianx

    This was also loosely inspired by:

    Issue #3463409 by mkalkbrenner: Parallel indexing using concurrent drush processes

    which showed that parallel indexing is possible in general.

  • Pipeline finished with Success
    3 months ago
    Total: 532s
    #511461
  • Pipeline finished with Success
    3 months ago
    Total: 466s
    #511980
  • 🇦🇹Austria drunken monkey Vienna, Austria

    drunken monkey made their first commit to this issue’s fork.

  • 🇦🇹Austria drunken monkey Vienna, Austria

    Amazing job, thanks a lot for this!
    Went through the code and largely just adjusted the code style according to my preferences, but looked pretty good already.
    Additionally, I switched the default tracker for newly created indexes to parallel – if this works well enough, I think there is no real reason to keep the Basic tracker at all. Or are there any actual downsides to this? Anyways, on second thought, we might want to first give this change some real-life experience with advanced users (who manually switch the tracker) before changing the default and then maybe even deprecating the Basic tracker. (📌 Move methods from Basic tracker plugin to TrackerPluginBase would be a prerequisite for the latter.)
    I also wonder if we shouldn’t remove the claim_timeout option from the UI, see my comment on the MR.

    Anyways, what is definitely still missing is test coverage for this. Though I’m not quite sure how we could properly test this – PHP famously isn’t really built for parallel processing.
    I don’t really think we need the rest of the “Remaining tasks”, though some benchmarks for the kind of improvement you could achieve would surely be nice.

    Thanks a lot again, great work!

  • First commit to issue fork.
  • Pipeline finished with Failed
    5 days ago
    Total: 353s
    #590822
  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Not adding new test coverage, but I fixed the existing tests. Especially the missing schema would make this patch throw exceptions when used on an actual project.

  • Pipeline finished with Failed
    5 days ago
    Total: 1049s
    #590840
  • 🇦🇹Austria drunken monkey Vienna, Austria

    @kristiaanvandeneynde: Thanks, nice job! Especially great that you found the problem with SearchApiDbUpdate8102Test, I think that would have utterly confounded me.
    The tests being marked as failed is (probably) due to the deprecation, which we already fixed in HEAD. Merging in latest changes, the pipeline should be green now.

  • 🇧🇪Belgium kristiaanvandeneynde Antwerp, Belgium

    Indeed it is :)
