Provide embeddings vector size

Created on 30 September 2024

Problem/Motivation

In some situations it is useful to know the size of the output vector when generating embeddings. Some providers have a static map (OpenAI for example) while some offer APIs to retrieve this information (Ollama for example).

Proposed resolution

Add a method to Drupal\ai\OperationType\Embeddings\EmbeddingsInterface that returns the vector size for a given model.
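For illustration, the proposed addition could look roughly like this. The method name `embeddingsVectorSize()` matches the one that appears later in the thread; the interface body and the example provider are a hypothetical sketch, not the module's actual code (the dimensions shown for the OpenAI models are their published sizes):

```php
<?php

// Illustrative sketch of the proposed interface addition. The method
// name embeddingsVectorSize() matches the one used later in the thread;
// everything else here is a hypothetical example, not module code.
interface EmbeddingsInterface {

  /**
   * Returns the output vector size (dimensions) for the given model.
   */
  public function embeddingsVectorSize(string $model_id): int;

}

// A provider with a static model-to-size map, as OpenAI-style providers
// could implement it.
class StaticMapProvider implements EmbeddingsInterface {

  private const DIMENSIONS = [
    'text-embedding-3-small' => 1536,
    'text-embedding-3-large' => 3072,
  ];

  public function embeddingsVectorSize(string $model_id): int {
    // 0 serves as a temporary "unknown" marker here.
    return self::DIMENSIONS[$model_id] ?? 0;
  }

}
```

Providers with a discovery API (such as Ollama) would query the model instead of using a static map.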

Remaining tasks

TBD

User interface changes

None.

API changes

A new method is added to Drupal\ai\OperationType\Embeddings\EmbeddingsInterface.

Feature request
Status

Active

Version

1.0

Component

AI Core module

Created by

🇸🇮Slovenia slashrsm


Merge Requests

Comments & Activities

  • Issue created by @slashrsm
  • 🇸🇮Slovenia slashrsm

The merge request adds the new method to the interface as proposed. It also implements it for the Ollama and OpenAI providers. I will give implementing it for the other providers a shot if the approach seems sensible.

  • Pipeline finished with Failed
    4 months ago
    Total: 179s
    #296964
  • 🇩🇪Germany marcus_johansson

    Thanks @slashrsm.

There is actually already a way to get dimensions, and AI Search uses it via the model configuration: if you switch between OpenAI models or Mistral when setting up AI Search, it changes to the right numbers. I didn't know that Ollama had a dynamic way of getting them; that will make things a lot easier, and we should implement it regardless of whether we do the interface change.

Since it's a vital component of embeddings, one could argue it should not live in the configuration object as it does today, but in a dedicated, required method instead, as you have solved it. I will loop in @scott_euser and @seogow here, since they have done the most work on AI Search, which would also need to be refactored if we change this.

  • 🇬🇧United Kingdom scott_euser

    Some phpcs issues to fix

Other than that, maybe don't use 0 as a response, so we can better tell whether the size was actually provided; I can imagine we could make use of this in the Search API backend configuration for validation.

    Thanks!

  • Pipeline finished with Failed
    4 months ago
    Total: 208s
    #297857
  • Pipeline finished with Canceled
    4 months ago
    Total: 77s
    #297863
  • Pipeline finished with Success
    4 months ago
    Total: 178s
    #297864
  • 🇸🇮Slovenia slashrsm

    I fixed the linting errors.

I am not sure if we should allow 'not provided' as an answer. I think that we should require the vector size in all cases. Is there a valid case for not providing it while supporting embeddings? The zero response was meant only as a temporary workaround until we implement it properly for all providers in this issue. If we agree that this is the approach to take, I will attempt to do it.

  • 🇩🇪Germany marcus_johansson

    @slashrsm - I think this is a great idea, so please continue.

You can currently find what we set as the config value in each provider's "definitions/api_defaults.yml" file, under the key embeddings.configuration.dimensions.

For Huggingface, LMStudio and Ollama we just set some value, and if you do not set one, I think the fallback in AI Search is 768. But if it's actually unknown, feel free to set it to 0; that would force people to look up and set a value when it is not known.

If you finish this, I will add the changes needed in AI Search in the same branch. Another method we could add later is whether the embeddings model allows dynamic dimension values per model; OpenAI, for instance, allows you to shorten the dimensions to save money.
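    For illustration, such an api_defaults.yml entry could look like this (the nesting is assumed from the key path embeddings.configuration.dimensions mentioned above; the value is hypothetical):

    ```yaml
    embeddings:
      configuration:
        dimensions: 768
    ```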

  • Pipeline finished with Success
    4 months ago
    Total: 251s
    #305028
  • Pipeline finished with Success
    4 months ago
    Total: 324s
    #305074
  • 🇸🇮Slovenia slashrsm

I implemented the new method for all providers in the module. For some I was not able to find an endpoint that would give me the size. For those I used a workaround where I generate an embedding for a pre-defined string and use the size of the returned array. To avoid repeated calls I also cached the result.
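    A minimal sketch of that fallback (the class and closure here are hypothetical, not the module's actual code): embed a fixed probe string once per model, take the length of the returned vector, and cache it.

    ```php
    <?php

    // Hypothetical sketch (not the module's actual code) of the fallback
    // described above: when no API reports the vector size, embed a fixed
    // probe string once, measure the returned vector, and cache the result
    // per model so repeated lookups cost nothing.
    class VectorSizeProbe {

      /** @var array<string, int> Cached sizes keyed by model ID. */
      private array $cache = [];

      public function __construct(private \Closure $embed) {}

      public function vectorSize(string $model_id): int {
        if (!isset($this->cache[$model_id])) {
          // One real embeddings call; its only purpose is the vector length.
          $vector = ($this->embed)('Hello world!', $model_id);
          $this->cache[$model_id] = count($vector);
        }
        return $this->cache[$model_id];
      }

    }
    ```

    In the real module the cache would presumably use Drupal's cache API rather than a per-request property.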

  • 🇩🇪Germany marcus_johansson

Thanks @slashrsm - I have pushed a change, as mentioned, for AI Search to use this as well, and I have moved your generic solution to a trait that is loaded in the abstract base, so it will work for external providers without them having to touch any code.

Please have a look and see if it's ok for you. I will also ask Scott to have a look at the AI Search changes.

  • 🇬🇧United Kingdom scott_euser

That's fine I think; we just need to make sure we then handle 0 as a response differently. I added one comment explaining what I mean.

  • 🇬🇧United Kingdom scott_euser

Sorry, for some reason I don't get notified when there are comments in GitLab MRs, only Drupal issue comments.
The validation issue has been addressed.

  • 🇬🇧United Kingdom scott_euser

Ah, I think I found the setting at https://git.drupalcode.org/-/profile/notifications - looks like Drupal sets it to a drupal.org no-reply email by default. Anyway, ready to merge I think, right?

  • 🇬🇧United Kingdom scott_euser

Hmmm, not 100% sure if it's a misconfiguration on my side, but I just wanted to try this with a non-OpenAI provider and I get this for Ollama mistral 7b:

    #0 /var/www/html/modules/contrib/ai/modules/providers/provider_ollama/src/Plugin/AiProvider/OllamaProvider.php(271): Drupal\provider_ollama\OllamaControlApi->embeddingsVectorSize()
    #1 [internal function]: Drupal\provider_ollama\Plugin\AiProvider\OllamaProvider->embeddingsVectorSize()
    #2 /var/www/html/modules/contrib/ai/src/Plugin/ProviderProxy.php(112): ReflectionMethod->invokeArgs()
    #3 /var/www/html/modules/contrib/ai/src/Plugin/ProviderProxy.php(81): Drupal\ai\Plugin\ProviderProxy->wrapperCall()
    #4 /var/www/html/modules/contrib/ai/modules/ai_search/src/Trait/AiSearchBackendEmbeddingsEngineTrait.php(134): Drupal\ai\Plugin\ProviderProxy->__call()
    #5 /var/www/html/modules/contrib/ai/modules/ai_search/src/Backend/AiSearchBackendPluginBase.php(57): Drupal\ai_search\Backend\AiSearchBackendPluginBase-
    ...

This is when I have the embedding model selected: [screenshot]

And when I then choose the embedding engine: [screenshot]

    And I can confirm that:

    1. curl host.docker.internal:11434/api/tags returns that embedding from within ddev
    2. terminal with `OLLAMA_HOST=0.0.0.0 ollama serve` shows POST to /api/show

So even if it's me not configuring something right (I have not tried Mistral before), I wonder if we need more validation/error handling there, as with this MR I now fail to complete the task, but without the MR I could set the embedding dimensions myself.

  • 🇩🇪Germany marcus_johansson

    We should write in the Drupal comments I guess :)

I checked this and the problem is that the embedding size key from the API can basically be called anything. For instance, mxbai-embed-large calls it bert.embedding_length, and it might use other keys or not exist at all depending on the model. Currently it would only work for models based on Llama, because those use the llama key.

I pushed a change to the API method so it looks for any key ending in embedding_length, and on top of that, if nothing is found or it returns 0, it uses the trait's method of doing an actual embedding to get the number.

If both of those fail, we can probably assume that the Ollama configuration is wrong or that Ollama is not responding?
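    The suffix-matching lookup described above could be sketched like this (a hypothetical standalone function, not the actual MR code; in Ollama's /api/show response these keys sit inside the model metadata):

    ```php
    <?php

    // Hypothetical sketch of the key lookup described above. The prefix
    // varies by model architecture ("llama.embedding_length",
    // "bert.embedding_length", ...), so we match on the suffix only.
    function findEmbeddingLength(array $model_info): int {
      foreach ($model_info as $key => $value) {
        if (str_ends_with((string) $key, '.embedding_length') && (int) $value > 0) {
          return (int) $value;
        }
      }
      // 0 means "not found"; the caller then falls back to doing a real
      // embedding call and counting the returned vector.
      return 0;
    }
    ```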

  • 🇬🇧United Kingdom scott_euser

Looking at the docs for that Ollama endpoint, that seems sensible. There is nothing to indicate it might be something other than *.embedding_length (though in fairness the docs also do not make clear that the llama prefix could be bert or anything else).

    Looks good to me! I tested it out and no issues now.

  • 🇸🇮Slovenia slashrsm

Great catch! My bad for only testing on Llama. I assumed the prefix referred to the Ollama app, not the model.

  • Pipeline finished with Skipped
    3 months ago
    #309356
  • 🇬🇧United Kingdom scott_euser

    Great thank you everyone! Merged

  • Automatically closed - issue fixed for 2 weeks with no activity.
