- Issue created by @kim.pepper
- 🇺🇸United States drupals.user
Will this be the replacement of the OpenSearch VDB Provider module?
https://www.drupal.org/project/ai_vdb_provider_opensearch
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
Yes. I discussed this with @fago in slack and we decided to move it here. We can re-use the connection plugin logic and existing search api implementation.
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
Added a draft MR with the basic plugin. We need to implement all the stub methods and save and retrieve connector configuration.
Also, a lot of this is duplicated in \Drupal\search_api_opensearch\Plugin\search_api\backend\OpenSearchBackend so maybe we need a trait?
- First commit to issue fork.
- First commit to issue fork.
- 🇺🇸United States lpeabody
The changes I've made allow me to successfully manage OpenSearch indexes. I tested with the standard and aws connectors.
- 🇬🇧United Kingdom yautja_cetanu
If you want to track this as part of the AI Initiative stuff and make work on it more visible to FTEs feel free to tag it as AI Initiative and tell me and I'll add this to https://www.drupalstarforge.ai/
- 🇳🇿New Zealand danielveza Brisbane, AU
If we want to use Titan embeddings v2, we need the MR from the issue 'Could not load the embeddings engine to get the dimensions. Please check the configuration. Error executing "InvokeModel" on "https://bedrock-runtime.us-east-1.amazonaws.com/model/amazon.titan-embed-text-v2%3A0/invoke";' (Active).
- 🇺🇸United States lpeabody
I think the proper way to handle this is to not store the connector configuration on the search_api.server configuration, and to instead attach the connector configuration as a dependency of the search_api.server configuration, so the connection information is always imported before the server.
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
Should we postpone this on the issue "SearchApiAiSearchBackend should ask configured VDB provider to supply dependencies" (Active), or can we work around it for now?
- 🇺🇸United States lpeabody
@kim.pepper I don't think it's necessary to postpone. I think the last commit I pushed future proofs it so dependency calculation will start being used as soon as the parent starts asking for the info. You can apply this patch to the ai module to get it working https://git.drupalcode.org/project/ai/-/merge_requests/850.diff.
- 🇺🇸United States lpeabody
I reverted the change introduced by danielveza, it was messing up my collection reads. I think it could have alternatively used explode with a limit of 2, so only the left-most underscore would be the divider. Ultimately I don't think getCollections is the correct place to do this. Realistically it should be a straight return of the index names from OpenSearch with no other manipulation.
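For reference, a minimal sketch of the explode-with-a-limit idea mentioned above. The function name and the example index name are hypothetical and not taken from the MR; the point is only that a limit of 2 splits on the left-most underscore and leaves any later underscores intact.

```php
<?php

declare(strict_types=1);

/**
 * Splits an index name on the left-most underscore only.
 *
 * Hypothetical helper for illustration; not part of the module.
 */
function split_index_name(string $indexName): array {
  // A limit of 2 yields at most two parts, so underscores after the
  // first one remain part of the second element.
  $parts = explode('_', $indexName, 2);

  // No underscore at all: treat the whole name as the collection.
  if (count($parts) === 1) {
    return ['prefix' => '', 'collection' => $parts[0]];
  }

  return ['prefix' => $parts[0], 'collection' => $parts[1]];
}

// Example with an assumed naming scheme of "<prefix>_<collection>".
$result = split_index_name('mysite_content_index');
```

Even so, this kind of splitting would probably belong outside getCollections, which per the comment above should just return the index names as OpenSearch reports them.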
- 🇺🇸United States lpeabody
I think we should clearly define what the scope should be for this issue. Is it strictly porting from the deprecated module to this one? Or should it build out the provider a bit more robustly (e.g. being able to delete entries)?
In an ideal world, I think a critical piece of this provider would be to ensure that records can be cleared from the index via deleteItems. Currently, the index will grow indefinitely, as old entity documents (all of an entity's chunks) are never cleared away. The result is duplicate and stale content every time an entity is saved.
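As a rough sketch of what clearing an entity's chunks could look like, the helper below builds the parameters for an OpenSearch delete-by-query request that matches every document belonging to the given entities. This is an assumption for illustration, not what the MR does: the function name is hypothetical, and the field name "drupal_entity_id" is a guess at how chunks might be tagged with their parent entity.

```php
<?php

declare(strict_types=1);

/**
 * Builds parameters for a delete-by-query request.
 *
 * Hypothetical helper; the default field name is an assumption.
 */
function build_delete_params(string $index, array $entityIds, string $field = 'drupal_entity_id'): array {
  return [
    'index' => $index,
    'body' => [
      'query' => [
        // A "terms" query matches documents whose field equals any of
        // the listed values, so all chunks of each entity are selected.
        'terms' => [$field => array_values($entityIds)],
      ],
    ],
  ];
}

// Example: select all chunks of two entities in an assumed index name.
$params = build_delete_params('ai_content', ['entity:node/1:en', 'entity:node/2:en']);
```

The resulting array would then be handed to whatever client call issues the delete-by-query request; running it before re-indexing an entity would avoid the stale duplicates described above.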
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
I'm thinking we should get things working here, then push it back to the ai_vdb_provider_opensearch module. Having it as a sub-module while the API is experimental is going to be problematic. We want to keep releases for the search_api_opensearch module fairly stable, while this might need multiple frequent releases until it's stable.
I've reached out to @fago and @Maximillian Mikus in Slack to see if we can be added as maintainers (if you agree) of ai_vdb_provider_opensearch.
- 🇺🇸United States lpeabody
OpenSearch should successfully be deleting items from the index now. Yay. I was wondering why so many duplicates were appearing...
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
I've been granted maintainership of ai_vdb_provider_opensearch so we should just decide when the best time would be to push this MR over there.
- 🇦🇺Australia RichardGaunt Melbourne
Hi, I've installed it and am using this on a chatbot prototype.
All works well and was a drop-in replacement for the Milvus vector database that I was using before.
No errors / bugs have turned up. Will let you know if anything comes up.
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
Ok great. Thanks for the feedback.
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
We should decide whether to fix the getConnector() failures or just duplicate some of the code and remove the trait.
- 🇺🇸United States lpeabody
What would be a reason to not fix the calls to getConnector? I think it just needs to accept which connector plugin you want to use and the configuration for it? Probably standard and whatever the connection details are to the opensearch instance incorporated into the test? I don't have a lot of experience writing Drupal tests so I'm just spitballing here. I'm also greatly reduced in my capacity to be able to work on this extension, I have rolled off the project that was working on this. At the same time, I feel like this is close to being stabilized and kinda want to see it over the finish line...