Make vector databases abstracted and installable for recipes - Part 2

Created on 19 June 2025

Problem/Motivation

Right now you can create a recipe that installs any third-party AI provider module and relies on a default model for each operation type. This means recipes do not have to be bound to OpenAI, Ollama, or any other specific provider.

We want to do the same for vector databases (VDBs), so that a recipe can install any VDB provider, or leave the provider to be set up manually.

Currently we have a config action that takes the embeddings model and its size from the default embeddings provider, so a recipe can skip those values. It looks something like this:

config:
  actions:
    search_api.server.modules_server:
      setupVdbServerWithDefaults:
        langcode: en
        status: true
        dependencies:
          module:
            - ai_search
        id: modules_server
        name: 'Modules Server'
        description: ''
        backend: search_api_ai_search
        backend_config:
          chat_model: litellm__chat
          database: postgres
          database_settings:
            database_name: db
            collection: modules
            metric: cosine_similarity
          embedding_strategy: contextual_chunks
          embedding_strategy_configuration:
            chunk_size: '3000'
            chunk_min_overlap: '100'
            contextual_content_max_percentage: '30'
          embedding_strategy_details: ''

The problem is that this solution still requires you to set backend_config.database, which ties the recipe to one specific VDB provider. In addition, backend_config.database_settings.database_name should be derivable from the VDB provider's default database. Postgres, for instance, always uses a single database.
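
To illustrate the goal, a recipe under the proposed behaviour might simply leave both values out. This is only a sketch: it assumes the action keeps its current name and that the fallbacks described in this issue exist, which is not implemented yet. Keys not shown would stay as in the example above.

```yaml
# Sketch only: assumes omitted keys are filled from defaults as proposed here.
config:
  actions:
    search_api.server.modules_server:
      setupVdbServerWithDefaults:
        id: modules_server
        name: 'Modules Server'
        backend: search_api_ai_search
        backend_config:
          chat_model: litellm__chat
          # database is omitted: it would fall back to the site's default VDB provider.
          database_settings:
            # database_name is omitted: it would come from the provider's default database.
            collection: modules
            metric: cosine_similarity
          embedding_strategy: contextual_chunks
```

With this shape, the same recipe could be applied on a site using any installed VDB provider without editing the recipe itself.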

Proposed resolution

In AiVdbProviderInterface, add a method called getDefaultDatabase that returns a string.
In AiVdbProviderClientBase, implement getDefaultDatabase so that it returns "default".
In ai.schema.yml, add another string field called default_vdb_provider.
In src/Form/AiSettingsForm.php (the AI Settings form), manually add a form element that lists all VDB providers. An empty value is allowed, and the default is loaded from the config.
On submit, save the updated value.
In the SetupVdbServer config action, if the value backend_config.database is not set, set it to the default VDB provider.
In the SetupVdbServer config action, if the value backend_config.database_settings.database_name is not set, create an instance of the VDB provider and use the value from its getDefaultDatabase method.
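
The schema step above could look roughly like the fragment below. It is a sketch under assumptions: the settings object name ai.settings and the labels are guesses rather than taken from the module, and the existing mapping keys are left out.

```yaml
# ai.schema.yml (sketch). The object name and labels are assumptions;
# existing mapping keys are omitted here.
ai.settings:
  type: config_object
  label: 'AI settings'
  mapping:
    default_vdb_provider:
      type: string
      label: 'Default VDB provider'
```

A plain string type allows the empty value mentioned above, which would mean that no default VDB provider is configured.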

Remaining tasks

We should decide whether backend_config.chat_model, which is used for token counting, also needs to be replaceable. I think "default" is chatgpt-3.5 anyway.

Feature request
Status

Active

Version

1.2

Component

AI Core module

Created by

🇩🇪Germany marcus_johansson
