Vector dimension mismatch when using Zilliz

Issue created by @ultimike
Comment 11 months ago →
jiangc
This seems to be an issue in the implementation of the Drupal Embeddings Engine or plugin, unrelated to Milvus. The Drupal code fails to create the Milvus collection with the vector dimension the user entered. For example, the input is:

"OpenAI | text-embedding-3-small" as the "Embeddings Engine", with "Dimensions" set to 1536

However, the collection created in Milvus specifies a vector field of 768 dimensions. It looks like the dimension is hard-coded, or there is a bug in how it is specified.
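
To make the mismatch concrete, here is a minimal sketch of a fail-fast check that compares a vector's length with the collection's dimension before inserting. The names `check_dimension` and `DimensionMismatchError` are hypothetical and exist only for illustration; this is not the Drupal module's or Milvus's code.

```python
# Hypothetical illustration; not the Drupal AI module's or Milvus's actual code.


class DimensionMismatchError(ValueError):
    """Raised when a vector's length differs from the collection's dimension."""


def check_dimension(vector: list[float], collection_dim: int) -> None:
    """Fail fast before an insert if the vector does not fit the collection."""
    if len(vector) != collection_dim:
        raise DimensionMismatchError(
            f"vector has {len(vector)} dimensions, "
            f"collection expects {collection_dim}"
        )


# A 1536-dimension embedding (the size entered in the form above)
# checked against a collection that was created with 768 dimensions.
embedding = [0.0] * 1536
try:
    check_dimension(embedding, collection_dim=768)
    message = "ok"
except DimensionMismatchError as err:
    message = str(err)
```

A check like this would surface the problem with a readable message at insert time instead of leaving it to the vector database to reject the data.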

I didn't find the code for Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider->insertIntoCollection() on GitHub. The code of the Embeddings Engine or plugin should reveal the root cause of the problem.
Assigned to marcus_johansson
Comment 11 months ago →
🇩🇪Germany marcus_johansson
Comment 11 months ago →
System Message

marcus_johansson → committed 2695f93a on 1.0.x
Issue #3471259 by marcus_johansson: Vector dimension mismatch when using...
Comment 11 months ago →
System Message

marcus_johansson → committed d360a6c6 on 1.0.x
Issue #3471259 by marcus_johansson: Vector dimension mismatch when using...
Comment 11 months ago →
🇩🇪Germany marcus_johansson
@ultimike - there was one bug and one usability issue here that could have caused this; the bug is most likely the culprit in your case.

Bug - Dimension size ajax loading

If you use a standard embeddings model from OpenAI, Mistral, or Fireworks AI, the dimension size is known, so when you choose the model in the select list on the Search API Backend form, the value should update automatically via Ajax.

This got broken in dev because we opened up the possibility of setting manual values for providers like LM Studio or Ollama, where the dimension size is unknown to us. Once we did that, the state system treated any change to this field as a manual change and kept the initial value of 768.

This has now been solved: the field is disabled by default, and there is a checkbox labelled "Set Dimensions Manually" that you have to check before you can change the value yourself. As long as you leave it unchecked and use one of the common embeddings engines, the correct value is filled in for you automatically.
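
The intended behaviour can be sketched roughly as follows. This is only an illustration under assumptions: `KNOWN_DIMENSIONS`, `resolve_dimensions`, and the engine keys are hypothetical names, not the module's form code, and the only table entry is the 1536 value quoted in this issue.

```python
from typing import Optional

# Assumed lookup table for illustration. 1536 is the value quoted in this
# issue for text-embedding-3-small; the real module gets it from the provider.
KNOWN_DIMENSIONS = {"openai__text-embedding-3-small": 1536}


def resolve_dimensions(
    engine: str,
    set_manually: bool,
    manual_value: Optional[int] = None,
) -> int:
    """Return the dimension to use when creating the collection."""
    if set_manually:
        # Checkbox ticked: the user is responsible for the value.
        if manual_value is None or manual_value <= 0:
            raise ValueError("a positive manual dimension is required")
        return manual_value
    # Checkbox unticked: ignore whatever is in the field and use the
    # known size for the selected engine.
    if engine not in KNOWN_DIMENSIONS:
        raise ValueError(f"dimension unknown for {engine!r}; set it manually")
    return KNOWN_DIMENSIONS[engine]


# A stale 768 left in the field is ignored while the checkbox is unticked.
auto = resolve_dimensions("openai__text-embedding-3-small", False, 768)
# Providers with unknown sizes (key is made up) need the manual path.
manual = resolve_dimensions("ollama__custom-model", True, 1024)
```

The point of the explicit flag is that a leftover default in the field can no longer be mistaken for a deliberate manual choice.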

I'm 99% certain this is what you ran into.

Usability - Allowing changes to embeddings engine

Until now, the embeddings engine could be changed after it was set. Since we do not rebuild the whole index when it changes, the backend would end up in a broken state. We decided that if you actually want to use a new embeddings engine, you should create a new backend and remove the old one, so the setting is now disabled when editing an existing backend.
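
As a rough sketch of that rule (all names here are hypothetical, not the module's classes): the engine can be chosen freely on a new backend and is locked once the backend has been saved.

```python
from dataclasses import dataclass


@dataclass
class BackendConfig:
    """Hypothetical stand-in for a Search API backend configuration."""

    embeddings_engine: str
    dimensions: int
    is_new: bool = True

    def save(self) -> None:
        # After the first save an index may exist for this engine.
        self.is_new = False

    def change_engine(self, engine: str, dimensions: int) -> None:
        if not self.is_new:
            # Existing vectors were built with the old engine, so switching
            # without a full rebuild would leave the index inconsistent.
            raise RuntimeError(
                "embeddings engine is locked; create a new backend instead"
            )
        self.embeddings_engine = engine
        self.dimensions = dimensions


# Engine keys below are made up for the example.
backend = BackendConfig("openai__text-embedding-3-small", 1536)
backend.save()
try:
    backend.change_engine("some_other_engine", 1024)
    locked = False
except RuntimeError:
    locked = True
```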

I don't think this was the issue, but it can cause similar problems, so it has been fixed as part of this ticket as well.

I have pushed these changes to the 1.0.x-dev release; feel free to test whether they work better for you. My suggestion is to remove any backend you have and create a new one.

I will keep this ticket open until I have written a functional regression test for this, since it worked in alpha6 but got broken since then.

Thanks for reporting!
Issue was unassigned.
Status changed to Fixed 10 months ago, 4:04am 3 September 2024
Comment 10 months ago →
🇬🇧United Kingdom scott_euser
Thanks for sorting it Marcus!
Comment 10 months ago →
System Message

marcus_johansson → committed d360a6c6 on 3456770-discuss-interface-suggestion
Issue #3471259 by marcus_johansson: Vector dimension mismatch when using...
Comment 10 months ago →
System Message

marcus_johansson → committed 2695f93a on 3456770-discuss-interface-suggestion
Issue #3471259 by marcus_johansson: Vector dimension mismatch when using...
Comment 10 months ago →
System Message

marcus_johansson → committed d360a6c6 on aws-bedrock
Issue #3471259 by marcus_johansson: Vector dimension mismatch when using...
Comment 10 months ago →
System Message

marcus_johansson → committed 2695f93a on aws-bedrock
Issue #3471259 by marcus_johansson: Vector dimension mismatch when using...
Comment 10 months ago →
System Message
Automatically closed - issue fixed for 2 weeks with no activity.
