Add support for AI module VDB Provider

Created on 16 July 2025, about 2 months ago

Problem/Motivation

Add support for an AI module Vector Database (VDB) provider so that OpenSearch can be integrated into the Drupal AI ecosystem.

Steps to reproduce

Proposed resolution

Create a sub-module that bridges the AI module's interfaces to the Search API interfaces.

Remaining tasks

User interface changes

API changes

Data model changes

📌 Task
Status

Active

Version

3.0

Component

Code

Created by

🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

Merge Requests

Comments & Activities

  • Issue created by @kim.pepper
  • 🇺🇸 United States drupals.user

    Will this replace the OpenSearch VDB Provider module?
    https://www.drupal.org/project/ai_vdb_provider_opensearch

  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    Yes. I discussed this with @fago in Slack and we decided to move it here. We can reuse the connection plugin logic and the existing Search API implementation.

  • Merge request !111 Draft: [#3536182] Add support for AI module (Closed) created by kim.pepper
  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    Added a draft MR with the basic plugin. We still need to implement all the stub methods, and to save and retrieve connector configuration.

    Also, a lot of this is duplicated in \Drupal\search_api_opensearch\Plugin\search_api\backend\OpenSearchBackend, so maybe we need a trait?

  • First commit to issue fork.
  • Pipeline finished with Failed
    about 1 month ago
    Total: 289s
    #564945
  • Pipeline finished with Failed
    about 1 month ago
    Total: 198s
    #565851
  • Pipeline finished with Failed
    about 1 month ago
    Total: 301s
    #567685
  • First commit to issue fork.
  • Pipeline finished with Failed
    23 days ago
    Total: 495s
    #577633
  • Pipeline finished with Failed
    23 days ago
    Total: 223s
    #577752
  • Pipeline finished with Failed
    22 days ago
    Total: 386s
    #578565
  • Pipeline finished with Failed
    22 days ago
    Total: 214s
    #578576
  • 🇺🇸 United States lpeabody

    The changes I've made allow me to successfully manage OpenSearch indexes. I tested with the standard and aws connectors.

  • Pipeline finished with Failed
    22 days ago
    Total: 258s
    #578597
  • 🇬🇧 United Kingdom yautja_cetanu

    If you want to track this as part of the AI Initiative and make work on it more visible to FTEs, feel free to tag it as AI Initiative and tell me, and I'll add it to https://www.drupalstarforge.ai/

  • 🇺🇸 United States lpeabody

    I think the proper way to handle this is not to store connector configuration on the search_api.server configuration, but instead to attach the connector configuration as a dependency of the search_api.server configuration, so the connection information is always imported before the server.

  • 🇺🇸 United States lpeabody
  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    Should we postpone this on 🐛 "SearchApiAiSearchBackend should ask configured VDB provider to supply dependencies" (Active), or can we work around it for now?

  • 🇺🇸 United States lpeabody

    @kim.pepper I don't think it's necessary to postpone. I think the last commit I pushed future-proofs it, so dependency calculation will start being used as soon as the parent starts asking for the info. To get it working, you can apply this patch to the ai module: https://git.drupalcode.org/project/ai/-/merge_requests/850.diff

  • Pipeline finished with Failed
    18 days ago
    Total: 310s
    #581529
  • Pipeline finished with Canceled
    18 days ago
    Total: 83s
    #581532
  • Pipeline finished with Failed
    18 days ago
    Total: 331s
    #581533
  • 🇺🇸 United States lpeabody

    I reverted the change introduced by danielveza because it was messing up my collection reads. I think it could alternatively have used explode with a limit of 2, so that only the left-most underscore would be the divider. Ultimately, I don't think getCollections is the correct place to do this. Realistically, it should be a straight return of the index names from OpenSearch with no other manipulation.
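
    A minimal sketch of the explode alternative, in plain PHP. The index name and the assumption that the left-most underscore separates a prefix from the rest of the name are hypothetical and only for illustration; they are not taken from the merge request.

    ```php
    <?php
    // Hypothetical index name; the prefix convention is an assumption.
    $name = 'mysite_product_chunks';

    // Without a limit, every underscore splits the name.
    $all = explode('_', $name);
    // $all is ['mysite', 'product', 'chunks'].

    // With a limit of 2, only the left-most underscore acts as the divider,
    // and the remainder of the string is kept intact as the second element.
    [$prefix, $rest] = explode('_', $name, 2);
    // $prefix is 'mysite', $rest is 'product_chunks'.
    ```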

  • 🇺🇸 United States lpeabody

    I think we should clearly define the scope of this issue. Is it strictly porting from the deprecated module to this one? Or should it build out the provider a bit more robustly (e.g. being able to delete entries)?

    In an ideal world, I think a critical piece of this provider would be ensuring that records can be cleared from the index via deleteItems. Currently, the index will grow indefinitely because old entity documents (all of an entity's chunks) are never cleared away. The result is duplicate and stale content every time an entity is saved.
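
    To illustrate the growth problem, here is a small in-memory sketch in plain PHP. It does not use the OpenSearch client or the module's deleteItems implementation; the document shape (entity_id, text) and both function names are hypothetical.

    ```php
    <?php
    // In-memory stand-in for an index: a list of chunk documents.
    // The document shape and function names are assumptions for illustration.

    function indexWithoutDelete(array $index, string $entityId, array $chunks): array {
      // Append the new chunks and leave whatever was indexed before.
      foreach ($chunks as $text) {
        $index[] = ['entity_id' => $entityId, 'text' => $text];
      }
      return $index;
    }

    function indexWithDelete(array $index, string $entityId, array $chunks): array {
      // Drop every existing chunk for this entity before adding the new ones.
      $index = array_values(array_filter(
        $index,
        fn(array $doc): bool => $doc['entity_id'] !== $entityId
      ));
      return indexWithoutDelete($index, $entityId, $chunks);
    }

    // First save of the entity produces two chunks.
    $stale = indexWithoutDelete([], 'node:1', ['old a', 'old b']);
    // Second save without deletion: the old chunks stay, so the index grows.
    $grown = indexWithoutDelete($stale, 'node:1', ['new a']);
    // Second save with deletion: only the current chunk remains.
    $clean = indexWithDelete($stale, 'node:1', ['new a']);
    // count($grown) is 3; count($clean) is 1.
    ```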

  • Pipeline finished with Failed
    17 days ago
    Total: 241s
    #582434
  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    I'm thinking we should get things working here, then push it back to the ai_vdb_provider_opensearch module. Having it as a sub-module while the API is experimental is going to be problematic. We want to keep releases of the search_api_opensearch module fairly stable, while this might need frequent releases until it's stable.

    I've reached out to @fago and @Maximillian Mikus in Slack to see if we can be added as maintainers (if you agree) of ai_vdb_provider_opensearch.

  • 🇺🇸 United States lpeabody

    OpenSearch should successfully be deleting items from the index now. Yay. I was wondering why so many duplicates were appearing...

  • Pipeline finished with Failed
    14 days ago
    #585585
  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    I've been granted maintainership of ai_vdb_provider_opensearch so we should just decide when the best time would be to push this MR over there.

  • 🇦🇺 Australia RichardGaunt Melbourne

    Hi, I've installed it and am using it on a chatbot prototype.
    Everything works well, and it was a drop-in replacement for the Milvus vector database that I was using before.
    No errors or bugs have turned up. I'll let you know if anything does.

  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    Ok great. Thanks for the feedback.

  • 🇦🇺 Australia kim.pepper 🏄‍♂️🇦🇺 Sydney, Australia

    We should decide whether to fix the getConnector() failures or just duplicate some of the code and remove the trait.

  • 🇺🇸 United States lpeabody

    What would be a reason not to fix the calls to getConnector? I think it just needs to accept which connector plugin you want to use and the configuration for it? Probably the standard connector, plus whatever the connection details are for the OpenSearch instance used in the test? I don't have a lot of experience writing Drupal tests, so I'm just spitballing here. My capacity to work on this extension is also greatly reduced, as I have rolled off the project that was working on it. At the same time, I feel like this is close to being stabilized, and I kind of want to see it over the finish line...
