Problem/Motivation
Right now you can create a recipe that works with any third-party AI provider module by relying on the default model for an operation type. This means that recipes do not have to be bound to OpenAI, Ollama, or any other specific provider.
We want to do the same with vector databases (VDBs), meaning that a recipe can install any VDB provider, or the provider can even be set up manually.
Currently we have a config action that takes the embeddings model and its size from the default embeddings provider, so the recipe can skip those values. It looks something like this:
config:
  actions:
    search_api.server.modules_server:
      setupVdbServerWithDefaults:
        langcode: en
        status: true
        dependencies:
          module:
            - ai_search
        id: modules_server
        name: 'Modules Server'
        description: ''
        backend: search_api_ai_search
        backend_config:
          chat_model: litellm__chat
          database: postgres
          database_settings:
            database_name: db
            collection: modules
            metric: cosine_similarity
          embedding_strategy: contextual_chunks
          embedding_strategy_configuration:
            chunk_size: '3000'
            chunk_min_overlap: '100'
            contextual_content_max_percentage: '30'
          embedding_strategy_details: ''
The problem is that this solution still requires you to set backend_config.database, which ties the recipe to one specific VDB provider. It should also be possible to fill in backend_config.database_settings.database_name from whatever the VDB provider uses as its default database. Postgres, for instance, has only one database that it always uses.
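As a rough sketch of the goal, the same action call without the provider-specific keys might look like the following. This is an illustration only: it assumes the action would fill in the two omitted keys, which it does not do today, and it uses the standard config.actions nesting of a recipe file.

```yaml
config:
  actions:
    search_api.server.modules_server:
      setupVdbServerWithDefaults:
        langcode: en
        status: true
        dependencies:
          module:
            - ai_search
        id: modules_server
        name: 'Modules Server'
        description: ''
        backend: search_api_ai_search
        backend_config:
          chat_model: litellm__chat
          # "database" is left out on purpose: the action would pick the
          # site's default VDB provider instead (assumed behaviour).
          database_settings:
            # "database_name" is left out on purpose: the action would ask
            # the chosen provider for its default database (assumed behaviour).
            collection: modules
            metric: cosine_similarity
          embedding_strategy: contextual_chunks
          embedding_strategy_configuration:
            chunk_size: '3000'
            chunk_min_overlap: '100'
            contextual_content_max_percentage: '30'
          embedding_strategy_details: ''
```

A recipe written this way would no longer name Postgres or any other VDB provider, so it could be applied on a site using a different one.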
Proposed resolution
In AiVdbProviderInterface, add a method called getDefaultDatabase that returns a string.
In AiVdbProviderClientBase, implement getDefaultDatabase so that it returns "default".
In the ai.schema.yml add another string field called default_vdb_provider
In src/Form/AiSettingsForm.php (the AI Settings form), manually add another form element that lists all VDB providers.
Leaving the element empty is allowed. Its default value is loaded from the config.
On submit save the updated value.
In the SetupVdbServer config action, if backend_config.database is not set, set it to the default VDB provider from the config (default_vdb_provider).
In the SetupVdbServer config action, if backend_config.database_settings.database_name is not set, create an instance of the VDB provider and use the value returned by its getDefaultDatabase method.
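The schema change from the steps above could be sketched roughly as follows. The field name default_vdb_provider comes from this issue; the surrounding ai.settings config object name, the existing keys, and the label text are assumptions for illustration and may not match the actual file.

```yaml
# Fragment of ai.schema.yml (surrounding structure is assumed).
ai.settings:
  type: config_object
  label: 'AI settings'
  mapping:
    # ... existing keys stay unchanged ...
    default_vdb_provider:
      type: string
      label: 'Default VDB provider'
      # An empty string would mean that no default VDB provider is selected.
```

Storing the value as a plain string keeps it in line with how backend_config.database already refers to a provider by its ID, so the config action can copy it over directly.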
Remaining tasks
We should decide whether backend_config.chat_model, which is used for token counting, also needs to be replaceable. I think "default" is chatgpt-3.5 anyway.