Problem/Motivation
We have a fairly large index that also stores vector embeddings. AWS backups are non-trivial to restore, and re-indexing is very slow because Search API doesn't support concurrent indexing out of the box (yet). In another issue we added code that clears the index only when necessary (for example, when a field mapping changes). However, it is not foolproof: so far the index has been cleared unexpectedly once. Because the method that checks whether mappings changed does no logging, I don't know why it was cleared, and it didn't happen locally or in staging environments. In addition, OpenSearch maps field types dynamically, so if someone forgets to add the explicit type, that could also cause a difference and trigger a clear.
Steps to reproduce
N/A
Proposed resolution
Add a new advanced option to the OpenSearch backend called safety_mode, which lets you turn safety mode on or off for the indexes you choose. If we detect that mappings have changed while safety mode is enabled, the index won't be cleared and a SearchApiException will be thrown.
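The intended behaviour can be sketched roughly as follows. This is an illustration in Python, not the module's PHP code: `update_index`, its parameters, and the `clear_index` callback are hypothetical names, and the exception class is only a stand-in for Search API's SearchApiException.

```python
class SearchApiException(Exception):
    """Stand-in for Search API's exception of the same name."""


def update_index(index_id, mappings_changed, safety_mode, clear_index):
    """Clear the index on a mapping change unless safety mode blocks it.

    Returns True if the index was cleared, False if no clear was needed.
    All names here are hypothetical and only illustrate the proposal.
    """
    if not mappings_changed:
        return False
    if safety_mode:
        # Refuse to destroy data; surface the problem to the caller instead.
        raise SearchApiException(
            f"Mappings changed for index '{index_id}' but safety_mode is "
            "enabled; the index was not cleared."
        )
    clear_index(index_id)
    return True
```

With safety mode off the behaviour is unchanged; with it on, a detected mapping change becomes an error the developer has to resolve deliberately.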
In addition, we should add logging during mapping detection so that if something changed, the developer knows why.
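One possible shape for that logging, sketched in Python rather than the module's PHP: compare the expected and server mappings key by key and log each difference with its path. The function names, the logger name, and the message wording are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("opensearch_mapping_sketch")  # hypothetical name


def mapping_differences(expected, actual, prefix=""):
    """Return human-readable differences between two nested mapping dicts."""
    diffs = []
    for key in sorted(set(expected) | set(actual)):
        path = f"{prefix}{key}"
        if key not in actual:
            diffs.append(f"{path}: missing on server")
        elif key not in expected:
            diffs.append(f"{path}: only on server")
        elif isinstance(expected[key], dict) and isinstance(actual[key], dict):
            # Recurse so the log names the exact nested property that differs.
            diffs.extend(mapping_differences(expected[key], actual[key], path + "."))
        elif expected[key] != actual[key]:
            diffs.append(
                f"{path}: expected {expected[key]!r}, server has {actual[key]!r}"
            )
    return diffs


def mappings_have_differences(expected, actual):
    """Log every difference, then report whether any were found."""
    diffs = mapping_differences(expected, actual)
    for diff in diffs:
        logger.warning("Mapping difference: %s", diff)
    return bool(diffs)
```

A field whose type was inferred dynamically instead of declared explicitly would show up here as a mismatch on its `type` path, which is the kind of cause that is currently invisible.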
Also, mappingsHaveDifferences currently returns TRUE if the OpenSearch server returns an exception. In my opinion this is not the correct approach: we should err on the side of caution and throw a SearchApiException if the server has an error.
When an exception is thrown, Search API creates a pending task that will try to re-run the updates when executed. If the server was only temporarily unresponsive, this might be OK; otherwise the developer can delete the task from the search_api_task table.
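The stricter error handling could look something like this sketch. It is written in Python for illustration only; `server_mappings_differ` and the `fetch_server_mappings` callback are hypothetical, and the exception class stands in for Search API's SearchApiException.

```python
class SearchApiException(Exception):
    """Stand-in for Search API's exception of the same name."""


def server_mappings_differ(fetch_server_mappings, expected):
    """Compare mappings, treating a server error as 'unknown', not 'changed'."""
    try:
        actual = fetch_server_mappings()
    except Exception as error:
        # A failed request says nothing about the mappings, so do not
        # report a difference (which would clear the index); fail instead.
        raise SearchApiException(
            "Could not read mappings from the server; leaving the index untouched."
        ) from error
    return actual != expected
```

The design choice is that an unreachable server is never taken as evidence that the mappings changed, so a transient outage cannot lead to a clear.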
Remaining tasks
Create MR.
User interface changes
New safety_mode option in the advanced section of the server page.
API changes
N/A
Data model changes
N/A