Prevent accidentally clearing the index

Created on 26 August 2025, 3 months ago

Problem/Motivation

We have a fairly large index that also stores vector embeddings. AWS backups are non-trivial to restore, and re-indexing is very slow because Search API doesn't support concurrent indexing out of the box (yet). In another issue we added code that only clears the index when necessary (for example, when a field mapping changes). However, it is not foolproof, and so far the index has been unexpectedly cleared once. Because there is no logging in the method that checks whether mappings have changed, I don't know why it was cleared, and it didn't happen locally or in staging environments. In addition, OpenSearch has dynamic typing, so if someone forgets to add an explicit type, the inferred type could differ from the expected one and result in the index being cleared.

Steps to reproduce

N/A

Proposed resolution

Add a new advanced option to the OpenSearch backend called safety_mode, which lets you turn safety mode on or off for the indexes you want to protect. If we detect that mappings have changed while safety mode is enabled, the index won't be cleared and a SearchApiException will be thrown instead.
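A minimal sketch of the proposed guard, written in Python for brevity rather than the module's PHP. The function name, its parameters, and the clear_index callback are hypothetical; SearchApiException here is a local stand-in for Search API's exception class, not the real one.

```python
class SearchApiException(Exception):
    """Local stand-in for Search API's SearchApiException."""


def update_index(index_id, mappings_changed, safety_mode, clear_index):
    """Clear the index only when mappings changed and safety mode is off.

    All names are hypothetical. clear_index is a callback that performs
    the destructive clear for the given index ID.
    """
    if not mappings_changed:
        # Nothing to do: the existing index is still compatible.
        return "unchanged"
    if safety_mode:
        # Refuse the destructive operation and surface the problem instead.
        raise SearchApiException(
            f"Mappings changed for index '{index_id}' but safety_mode is "
            "enabled; refusing to clear the index."
        )
    clear_index(index_id)
    return "cleared"
```

The point of raising rather than silently skipping is that the developer finds out the mappings have drifted and can decide whether a clear and re-index is acceptable.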

In addition, we should add logging during mapping detection so that if something changed, the developer knows why.
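One way the logging could look, again as a Python sketch and not the module's actual code. The helper mapping_differences, the logger name, and the message wording are all assumptions made for illustration; the idea is only that every detected difference is reported with its field path before the check returns.

```python
import logging

logger = logging.getLogger("opensearch_mappings")  # hypothetical logger name


def mapping_differences(current, expected, path=""):
    """Return human-readable differences between two nested mapping dicts."""
    differences = []
    for key in sorted(set(current) | set(expected)):
        field = f"{path}.{key}" if path else key
        if key not in current:
            differences.append(f"{field}: missing on server")
        elif key not in expected:
            differences.append(f"{field}: only on server")
        elif isinstance(current[key], dict) and isinstance(expected[key], dict):
            # Recurse so the log names the exact nested property that differs.
            differences.extend(
                mapping_differences(current[key], expected[key], field)
            )
        elif current[key] != expected[key]:
            differences.append(
                f"{field}: server has {current[key]!r}, "
                f"expected {expected[key]!r}"
            )
    return differences


def mappings_have_differences(current, expected):
    """Log each difference, then report whether any were found."""
    differences = mapping_differences(current, expected)
    for difference in differences:
        logger.warning("Mapping difference: %s", difference)
    return bool(differences)
```

With output like this, a case where a field's type was inferred dynamically on the server but declared differently in the index configuration would show up as a single log line naming that field.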

Also, mappingsHaveDifferences currently returns TRUE if the OpenSearch server returns an exception. IMO that is not the correct approach: we should err on the side of caution and throw a SearchApiException if the server has an error.
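The difference between the two behaviours can be sketched as follows, in Python rather than PHP. The fetch_mappings callback is a hypothetical stand-in for the request to the OpenSearch server, and SearchApiException is again a local stand-in class.

```python
class SearchApiException(Exception):
    """Local stand-in for Search API's SearchApiException."""


def mappings_have_differences(fetch_mappings, expected):
    """Compare server mappings with the expected ones, failing closed.

    fetch_mappings is a hypothetical callable that returns the current
    mappings or raises if the server cannot be reached.
    """
    try:
        current = fetch_mappings()
    except Exception as error:
        # A server error says nothing about the mappings, so do not report
        # "changed" (which would lead to a clear); raise instead.
        raise SearchApiException(
            "Could not retrieve mappings from the server"
        ) from error
    return current != expected
```

Returning TRUE on an error treats "unknown" as "changed", which is the one outcome that can destroy data; raising leaves the index untouched.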

When an exception is thrown, Search API creates a pending task that will try to re-run the updates when executed. If the server was only temporarily unresponsive, this might be OK; otherwise the developer can delete the task from the search_api_task table.

Remaining tasks

Create MR.

User interface changes

New safety_mode option in the advanced section of the server page.

API changes

N/A

Data model changes

N/A

Feature request
Status

Active

Version

3.0

Component

Code

Created by

achap 🇦🇺


Merge Requests

Comments & Activities

  • Issue created by @achap
  • Merge request !114 "Draft: Add safety_mode option" (Open) created by achap
  • Pipeline finished with Success
    3 months ago
    Total: 363s
    #581729
  • achap 🇦🇺
    • Added the safety_mode option to the server backend.
    • Added logging to the mappingsHaveDifferences method so we know why something is different.
    • If getting the mappings doesn't succeed for some reason, an exception is now thrown.
    • Removed a conditional from mappingsHaveDifferences which I think was redundant.
    • An exception is now thrown if mappings have changed while safety mode is enabled.

    I also noticed that when a task is created and you try to re-run it (after disabling safety mode), there is an infinite loop: when the task is retried, updateIndex calls $index->clear(), which fires off an event to execute any tasks, which calls updateIndex, which calls $index->clear(), and so on. I think that's a search_api bug though.

    So the developer will have to manually delete the search_api_task.

  • Pipeline finished with Success
    about 1 month ago
    Total: 211s
    #614548