Add a parallel tracker to allow concurrent indexing

Created on 1 June 2025

Problem/Motivation

The current Search API module only supports single-threaded indexing, which can be very slow for large datasets. When indexing millions of items, a single worker process may take hours or days to complete. This creates performance bottlenecks, especially during initial site setup, data migrations, or when rebuilding indexes.

The Basic tracker uses global locks that prevent multiple indexing processes from running simultaneously, even though the underlying search backends (like Solr, Elasticsearch, or Database) could handle concurrent indexing operations efficiently.

Steps to reproduce

  1. Create a large Search API index with thousands of items to be indexed
  2. Run `drush search-api:index my_index`
  3. While the first command is running, try to run a second `drush search-api:index my_index` command
  4. Observe that the second command fails with an error
  5. Notice that only one worker can process items at a time, leading to slow indexing performance

Proposed resolution

Implement a new "Parallel" tracker plugin that enables lock-free parallel indexing using database-level worker claims instead of global locks.

After switching the index's tracker to Parallel, you can run:

$ drush search-api-mark-all

$ drush search-api:index & drush search-api:index & drush search-api:index & drush search-api:index & drush search-api:index &

and 5 workers will index all items of all indexes in parallel.
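The shell one-liner above returns immediately and does not report whether any worker failed. A minimal sketch of a launcher that starts the same workers and waits for all of them could look like this. The `run_workers` and `main` helpers are hypothetical and not part of Search API or Drush; the sketch assumes `drush` is on the PATH and the index already uses the Parallel tracker.

```python
import subprocess
import sys


def run_workers(command, count):
    """Start `count` copies of `command` concurrently and wait for all of them.

    Returns the exit codes in the order the workers were started.
    """
    procs = [subprocess.Popen(command) for _ in range(count)]
    return [proc.wait() for proc in procs]


def main(argv):
    """Entry point: argv[1] is the worker count, the rest is the command.

    Returns 1 if any worker exited with a non-zero code, else 0.
    """
    count, command = int(argv[1]), argv[2:]
    codes = run_workers(command, count)
    return 1 if any(codes) else 0


# Example (not executed here): five parallel indexing workers.
#   sys.exit(main(["launcher", "5", "drush", "search-api:index"]))
```

Waiting on every process makes it possible to chain follow-up steps (for example a cache rebuild) only after all workers have finished.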

The solution includes:

Core Architecture Changes

  • Move locking logic from Index entity to tracker plugins by extending TrackerInterface with `lock()`, `unlock()`, and `getLockId()` methods
  • Maintain backward compatibility by having Index delegate to tracker for locking operations
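To illustrate the delegation idea, here is a language-neutral sketch in Python (the module itself is PHP). Only the three method names come from the description above; the class bodies, the in-memory lock set, the lock ID format, and the choice to return the worker ID from `get_lock_id()` in the parallel case are assumptions made for illustration.

```python
import uuid
from abc import ABC, abstractmethod


class Tracker(ABC):
    """Stand-in for the extended TrackerInterface."""

    @abstractmethod
    def lock(self): ...

    @abstractmethod
    def unlock(self): ...

    @abstractmethod
    def get_lock_id(self): ...


class BasicTracker(Tracker):
    """One global lock per index: a second caller is refused."""

    _held = set()  # hypothetical in-memory stand-in for the lock backend

    def __init__(self, index_id):
        self.index_id = index_id

    def get_lock_id(self):
        # Hypothetical ID format, for illustration only.
        return f"index-lock:{self.index_id}"

    def lock(self):
        lock_id = self.get_lock_id()
        if lock_id in self._held:
            return False
        self._held.add(lock_id)
        return True

    def unlock(self):
        self._held.discard(self.get_lock_id())


class ParallelTracker(BasicTracker):
    """No global lock: each lock() call only assigns a fresh worker ID."""

    def __init__(self, index_id):
        super().__init__(index_id)
        self.worker_id = None

    def get_lock_id(self):
        return self.worker_id

    def lock(self):
        self.worker_id = str(uuid.uuid4())
        return True

    def unlock(self):
        self.worker_id = None


class Index:
    """Keeps its locking entry points but forwards them to the tracker."""

    def __init__(self, tracker):
        self.tracker = tracker

    def lock(self):
        return self.tracker.lock()

    def unlock(self):
        self.tracker.unlock()
```

Because callers still go through `Index.lock()` and `Index.unlock()`, existing code keeps working while the tracker decides what "locked" means.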

Parallel Tracker Implementation

  • Create new Parallel tracker extending the Basic tracker
  • Add `expires_at` and `worker_id` columns to `search_api_item` table for worker claim management
  • Use brief advisory locks around select-then-claim for race-condition-free item claiming
  • Implement automatic cleanup of expired worker claims
  • Configure claim timeout (default: 60 seconds) to handle worker failures
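The expiry mechanism can be sketched with SQLite as a stand-in database. This is an illustration under assumptions, not the patch's code: the table is reduced to the item ID plus the two new columns, and the helper names `make_table`, `claim`, and `release_expired_claims` are hypothetical.

```python
import sqlite3
import time

CLAIM_TIMEOUT = 60  # seconds; the default named in the description


def make_table(conn):
    # Simplified stand-in for search_api_item with the two new columns.
    conn.execute(
        "CREATE TABLE search_api_item ("
        " item_id TEXT PRIMARY KEY,"
        " worker_id TEXT,"
        " expires_at INTEGER)"
    )
    conn.commit()


def claim(conn, item_id, worker_id, now=None):
    """Mark one item as claimed by `worker_id` until now + CLAIM_TIMEOUT."""
    now = int(time.time()) if now is None else now
    conn.execute(
        "UPDATE search_api_item SET worker_id = ?, expires_at = ?"
        " WHERE item_id = ?",
        (worker_id, now + CLAIM_TIMEOUT, item_id),
    )
    conn.commit()


def release_expired_claims(conn, now=None):
    """Clear claims whose expires_at has passed; return how many were freed."""
    now = int(time.time()) if now is None else now
    cur = conn.execute(
        "UPDATE search_api_item SET worker_id = NULL, expires_at = NULL"
        " WHERE worker_id IS NOT NULL AND expires_at < ?",
        (now,),
    )
    conn.commit()
    return cur.rowcount
```

A worker that crashes never clears its own claim, so running the cleanup before each selection returns its items to the pool once the timeout has passed.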

Worker Claim Process

  1. Each worker gets a unique UUID when `lock()` is called
  2. Workers select available items (worker_id IS NULL) using existing ordering logic
  3. Brief advisory lock (5 seconds) prevents MySQL deadlocks during claiming
  4. Workers claim selected items by setting their worker_id and expires_at timestamp
  5. Successfully indexed items are released; claims held by failed workers automatically time out

Improved User Experience

  • Enhanced IndexBatchHelper to detect parallel indexing scenarios
  • Replace confusing "Couldn't index items" errors with helpful status messages
  • Show "No items available for this worker" instead of errors when other workers handle the work
  • Eliminate false "less than expected items indexed" warnings during parallel processing

Key Benefits

  • Drop-in replacement: Works with existing `drush search-api:index` commands
  • Scalable: Multiple workers can run simultaneously without conflicts
  • Fault-tolerant: Failed workers automatically release claims via timeout
  • Backend-agnostic: Works with Database, Solr, Elasticsearch, and other backends
  • Configurable: Adjustable claim timeouts

Remaining tasks

  • [ ] Add comprehensive test coverage for parallel indexing scenarios
  • [ ] Add documentation for parallel indexing setup and configuration
  • [ ] Performance testing with multiple concurrent workers
  • [ ] Test with different search backends (Solr, Elasticsearch)
  • [ ] Consider adding monitoring/logging for worker claim statistics

Note: AI was used for creating this documentation and an augmented AI workflow was used for creating the patch, but all architecture, planning and refinement were my own. The code is now exactly how I would have written it had I written it 100% manually.

Feature request
Status

Needs review

Version

1.0

Component

General code

Created by

🇩🇪Germany Fabianx

  • Performance

    It affects performance. It is often combined with the Needs profiling tag.
