- Issue created by @marcus_johansson
- 🇬🇧United Kingdom seogow
I believe an index should not allow its strategy to be changed once it has been created. An index is a data structure defined by both the strategy and the embedding vectors. If something needs to change, a new index should be created.
- 🇩🇪Germany marcus_johansson
But you have the same issue in Solr for instance (and will have here) when you add or remove a field.
We will probably never have an index with 20 fields like you might have in Solr, but it sucks to have to recreate everything instead of just having to re-index everything.
- 🇬🇧United Kingdom seogow
The difference between a VDB and Solr (when only keyword search is allowed in Solr) is that Solr will still work the same if you add or remove a field: the field is just returned as NULL if it is not populated. On the other hand, a vector might not only have a different number of dimensions (hard fail), but the returned items might also need different handling (soft fail in the strategy's postprocessing).
However, I see your point about being able to choose a different strategy for the same setup (e.g. during development, to test the strategies' outcomes). For that case, the index can be emptied (deleted and recreated) in the VDB backend while the index configuration stays; the deleting and recreating happens when the changed index is saved.
This is a destructive operation, so we must show a pop-up warning before allowing the changed index to be saved.
Adding/removing fields in that case doesn't cause any issues - they will be taken care of during new indexing.
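To make the hard fail and the destructive save concrete, here is a minimal sketch in Python. It is not the AI module's code: `ToyVectorIndex`, its methods and the `confirmed` flag are hypothetical names used only to illustrate the idea that the configuration survives a strategy change while the stored vectors do not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyVectorIndex:
    """Hypothetical in-memory stand-in for an index in a VDB backend."""

    dimensions: int
    strategy: str
    vectors: dict[str, list[float]] = field(default_factory=dict)

    def upsert(self, item_id: str, vector: list[float]) -> None:
        # Hard fail: a vector with the wrong number of dimensions cannot be stored.
        if len(vector) != self.dimensions:
            raise ValueError(
                f"expected {self.dimensions} dimensions, got {len(vector)}"
            )
        self.vectors[item_id] = vector

    def change_strategy(self, strategy: str, dimensions: int, confirmed: bool) -> None:
        # Destructive: refuse unless the user has acknowledged the warning.
        if not confirmed:
            raise PermissionError("changing the strategy empties the index")
        self.strategy = strategy
        self.dimensions = dimensions
        # The configuration stays; every stored vector is dropped.
        self.vectors.clear()
```

After `change_strategy(..., confirmed=True)` the index is empty and everything has to be indexed again, which is why the warning comes first.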
- Status changed to Needs review
2:26pm 6 September 2024
- 🇩🇪Germany marcus_johansson
Just going through the bug backlog - I agree with you here, this will be disabled on edit. A pull request exists here: https://www.drupal.org/project/ai/issues/3462030 (Flush index if Embedding Strategy is set)
- 🇬🇧United Kingdom scott_euser
Marcus, I think your link points to this issue itself. I could work on this, but it sounds like we already have a solution somewhere?
- 🇩🇪Germany marcus_johansson
@scott_euser - sorry, I can't link correctly: https://git.drupalcode.org/project/ai/-/merge_requests/62. It's essentially just disabling the options if you are editing instead of creating.
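As a rough illustration of "disable on edit" (a Python sketch of the idea, not the code in the merge request), the form element is locked for existing indexes and a server-side guard ignores any strategy submitted for them. The function names and the dictionary keys are assumptions made for this example.

```python
def build_strategy_field(current: str, available: list[str], is_new_index: bool) -> dict:
    """Describe the strategy selector; it is editable only while creating."""
    return {
        "type": "select",
        "options": list(available),
        "default": current,
        # Existing indexes get a read-only selector.
        "disabled": not is_new_index,
    }


def resolve_strategy(stored: str, submitted: str, is_new_index: bool) -> str:
    """Server-side guard: a disabled field in the browser is not enough."""
    if is_new_index:
        return submitted
    # For an existing index the stored strategy always wins.
    return stored
```

The second function matters because a disabled form element only stops well-behaved clients; the saved value should be checked as well.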
- 🇬🇧United Kingdom scott_euser
Hmmm, I wonder whether wanting to change the embedding strategy is a valid use case though? Perhaps we should instead show a warning message (Drupal messenger) saying something like 'You have changed your embedding strategy, if you intend to keep this change you should requeue all items for re-indexing or you will likely have unexpected results.'
The approach in your MR is also okay, I suppose, but it forces the site builder to change the embedding strategy by editing the config and importing it, and we have to hope they know that they should then re-index.
- 🇬🇧United Kingdom scott_euser
We could also actually queue all items for reindexing on change I suppose, but it could be that they change it and want to change it back immediately after seeing the warning...
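The two options (warn only, or warn and requeue) can be sketched as below. This is a hedged Python illustration, not Drupal code: `on_index_saved` and its `auto_requeue` parameter are hypothetical, and the returned lists stand in for the messenger and the indexing queue.

```python
WARNING = (
    "You have changed your embedding strategy, if you intend to keep this "
    "change you should requeue all items for re-indexing or you will likely "
    "have unexpected results."
)


def on_index_saved(
    old_strategy: str,
    new_strategy: str,
    item_ids: list[str],
    auto_requeue: bool = False,
) -> tuple[list[str], list[str]]:
    """Return (messages to show, item IDs to requeue) for a saved index."""
    messages: list[str] = []
    requeued: list[str] = []
    if old_strategy != new_strategy:
        messages.append(WARNING)
        if auto_requeue:
            # Every item is queued again, even if the change is reverted next.
            requeued = list(item_ids)
    return messages, requeued
```

With `auto_requeue=True`, changing the strategy and then changing it straight back queues everything twice, which is the drawback mentioned above.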
- Status changed to Postponed: needs info
11:31am 22 November 2024
- 🇬🇧United Kingdom MrDaleSmith
Needs discussion over correct approach as per last 2 comments.
- 🇬🇧United Kingdom seogow
I believe we should not allow changing the embedding strategy for an index. The reason is that some of the strategies intentionally create a single chunk (so the search backend doesn't need to deal with multiple chunks per Entity ID), which can be expected and further processed as is. Changing the strategy would have unwanted consequences.
I believe this addresses the issue: https://git.drupalcode.org/project/ai/-/merge_requests/62
Proposal: close this as Done not Doing. I shall close this ticket accordingly on Tuesday, 4 February 2025 if there are no further objections?
- 🇩🇪Germany marcus_johansson
After using it for a while I agree, @seogow - most changes when testing out indexes happen on the fields, not on the embedding strategy. From my point of view you could close this already and also close that MR without merging.