Configurable shards, replicas

Created on 7 May 2025, 3 months ago

Problem/Motivation

The shard and replica settings for the Elasticsearch index are not exposed. In addition, BackendClient::addIndex() offers no way to customize the index settings when it creates the index.

Proposed resolution

Expose the number of shards and replicas in the Search API server form, and use those values in BackendClient when creating the index.
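As a rough illustration of the idea, the sketch below builds the body of an Elasticsearch create-index request from two optional values. It is written in Python only to stay language-neutral; the module itself is PHP, and the function name build_create_index_body is hypothetical, not part of the merge request. The sketch assumes that leaving a value unset should omit the key entirely, so that the cluster's own defaults still apply.

```python
from typing import Optional


def build_create_index_body(shards: Optional[int] = None,
                            replicas: Optional[int] = None) -> dict:
    """Build a create-index request body (hypothetical helper).

    A value of None means "not configured": the key is omitted so that
    Elasticsearch falls back to whatever default the cluster provides.
    """
    index_settings = {}
    if shards is not None:
        if shards < 1:
            raise ValueError("number_of_shards must be at least 1")
        index_settings["number_of_shards"] = shards
    if replicas is not None:
        if replicas < 0:
            raise ValueError("number_of_replicas must not be negative")
        index_settings["number_of_replicas"] = replicas
    # Send no settings at all when nothing was configured.
    if not index_settings:
        return {}
    return {"settings": {"index": index_settings}}
```

Treating "not configured" as a distinct state, rather than defaulting to 1, is one possible way to avoid overriding values that a site already relies on.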

Remaining tasks

  1. Implement
  2. Review

User interface changes

API changes

None

Data model changes

None

Feature request
Status

Active

Version

8.0

Component

Code

Created by

🇨🇴Colombia jedihe


Merge Requests

Comments & Activities

  • Issue created by @jedihe
  • 🇨🇴Colombia jedihe

    Initial implementation, minimally tested.

  • Merge request !93: Issue #3523238: Configurable shards, replicas → (Open) created by jedihe
  • Pipeline finished with Success
    3 months ago
    Total: 672s
    #491600
  • 🇫🇮Finland sokru

    Thanks for working on this! Since the proposed solution introduces data model changes, it would require an update hook.
    The solution raises a few questions:
    1. What if a user has already configured the default number of shards and replicas in elasticsearch.yml (e.g. index.number_of_replicas)? This would be a breaking change for those users.
    2. Without an update path, the code defaults to 1 replica, which will cause the health of a single-node cluster to always be yellow (instead of green).

    One solution could be to use or extend the IndexCreatedEvent event to achieve the same result in a custom module. Elasticsearch provides a large number of index settings (https://www.elastic.co/docs/reference/elasticsearch/index-settings/index...), so I'd say that adding new configuration options to the Drupal UI should be carefully considered.
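The yellow-health concern in point 2 follows from how replicas are allocated: a replica cannot be placed on the same node as its primary. The Python sketch below is a deliberately simplified model of that rule, not Elasticsearch's actual allocation logic. The function name expected_health is hypothetical, and the model assumes every primary shard is already assigned and that no allocation filters apply.

```python
def expected_health(data_nodes: int, replicas: int) -> str:
    """Estimate cluster health for one index (simplified model).

    Each copy of a shard (the primary plus its replicas) needs its own
    node, so all replicas can be assigned only if replicas < data_nodes.
    """
    if data_nodes < 1:
        return "red"  # nowhere to place even the primaries
    return "green" if replicas < data_nodes else "yellow"
```

Under this model a single-node cluster stays yellow with one replica and turns green only with zero replicas.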

  • 🇧🇪Belgium borisson_ Mechelen, 🇧🇪

    Agreeing with @sokru that adding fields to Search API's admin screens needs careful consideration.

  • 🇨🇴Colombia jedihe

    Quick finding: static index settings cannot be updated after index creation, and number_of_shards is a static index setting.
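To make the consequence of this finding concrete, the Python sketch below splits a set of requested setting changes into those that can be applied to an existing index and those that cannot. It only knows about the two settings discussed in this issue: number_of_shards is static and number_of_replicas is dynamic. The names STATIC_SETTINGS and split_setting_changes are hypothetical and do not come from the module.

```python
# Settings that cannot change once the index exists. Only the setting
# discussed in this issue is listed; the real list is longer.
STATIC_SETTINGS = frozenset({"number_of_shards"})


def split_setting_changes(current: dict, desired: dict) -> tuple:
    """Return (updatable, blocked) changes for an existing index."""
    updatable, blocked = {}, {}
    for key, value in desired.items():
        if current.get(key) == value:
            continue  # unchanged, nothing to do
        if key in STATIC_SETTINGS:
            blocked[key] = value  # would require creating a new index
        else:
            updatable[key] = value  # can be sent as a settings update
    return updatable, blocked
```

A form validator could, for example, reject the submission whenever the blocked part is non-empty for an index that already exists.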

  • Status changed to Needs work about 1 month ago
  • First commit to issue fork.
  • 🇨🇦Canada mparker17 UTC-4

    Taking a look at this issue after the latest update: I don't think that @sokru's questions in #7 have been directly answered/addressed yet, i.e.:

    1. What if a user has already configured the default number of shards and replicas in elasticsearch.yml (e.g. index.number_of_replicas)? This would be a breaking change for those users.
    2. Without an update path, the code defaults to 1 replica, which will cause the health of a single-node cluster to always be yellow (instead of green).

    Also, as @jedihe points out in #9, if we are going to make it possible to specify number_of_shards, we need to find a way to prevent it from being changed after the index has been created. See the issue "Support Aliases API and zero downtime mapping updates" (Active) for one possible solution; a simpler way might be to block the change outright.

    As a maintainer, it would help me to understand why someone using this module would want to change the number of shards and replicas for an index from the default. Elasticsearch's defaults have always seemed to work well for me, so I'm not sure why I would need to override them. That said, I'm looking at it from my client's perspective, i.e. a lexical search backend for 91,000 documents. May I trouble someone to update the issue summary with that information?

    I also notice that there are no tests. Automated tests ultimately benefit you: they ensure that future changes to the module will not break the functionality that you depend on. If you need help writing tests, please ask us!

  • 🇨🇦Canada mparker17 UTC-4

    (link to merge request in issue summary)
