Add support for neural search (text embeddings)

Issue created by @slashrsm
Comment 10 months ago →
🇸🇮Slovenia slashrsm
Status changed to Needs review 10 months ago, 4:38pm 7 September 2024
Comment 10 months ago →
🇸🇮Slovenia slashrsm
Still a work in progress, but the indexing side of things works.
Merge request !64: #3472769: Add support for neural search. → (Open) created by slashrsm
Pipeline finished with Failed
10 months ago
Total: 385s
#276623
Status changed to Needs work 10 months ago, 12:39am 9 September 2024
Comment 10 months ago →
🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia
Thanks Janez. Looks like a great start. Besides the linting errors, I can see there are still a lot of hard-coded values, and we're missing tests. I know it's a complex area, but it would be good to have some basic docs on setup and links for further info.
Comment 10 months ago →
🇦🇺Australia acbramley
Quick n dirty review for now as this is obviously still WIP - great to see movement in this area though!
Pipeline finished with Failed
9 months ago
Total: 352s
#283755
Pipeline finished with Failed
9 months ago
Total: 354s
#283765
Pipeline finished with Failed
9 months ago
Total: 223s
#283871
Pipeline finished with Success
9 months ago
Total: 214s
#283874
Pipeline finished with Success
9 months ago
Total: 413s
#283881
Pipeline finished with Canceled
9 months ago
Total: 93s
#283885
Pipeline finished with Failed
9 months ago
Total: 1479s
#283886
Pipeline finished with Success
9 months ago
Total: 263s
#283896
Comment 9 months ago →
🇸🇮Slovenia slashrsm
Pipeline finished with Success
9 months ago
Total: 260s
#284387
Comment 9 months ago →
🇸🇮Slovenia slashrsm
slashrsm → changed the visibility of the branch query_side to hidden.
Comment 9 months ago →
🇸🇮Slovenia slashrsm
After watching the Driesnote and looking into the AI module a bit, I realized that we are basically re-implementing their provider plugins here. To avoid that, I decided to depend on the AI module for providers. The updated MR assumes/uses the "Provide embeddings vector size" issue (Active), which adds the vector size function that we rely on.
Pipeline finished with Failed
9 months ago
Total: 225s
#296983
Pipeline finished with Failed
9 months ago
Total: 222s
#297007
Comment 4 months ago →
🇦🇹Austria maximilianmikus
I was looking into adding OpenSearch as a vector database provider and found this issue by chance. Wouldn't it be better to put this functionality in its own provider module? I had already started a project for exactly that before I came across this issue.
Comment 11 days ago →
🇺🇸United States damienmckenna NH, USA
FYI the separate provider module has been deprecated in favor of this issue, though the current MR doesn't apply against the 3.x branch.
Comment 9 days ago →
🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia
A recommended approach for vector indexing is an ingest pipeline. I wonder if this issue could be expanded to include support for that?
Merge request !107: [#3472769] Add support for vector indexes → (Open) created by kim.pepper
Comment 5 days ago →
🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia
Started work on a more integrated approach. At this stage all the MR does is set index.knn = TRUE when creating the index.

In order to have knn enabled on an index, we need to set that option when creating the index. We can't change it afterwards.

This meant we needed to refactor the addIndex() method to not create then update settings, but to pass the settings at creation time. This refactoring could potentially be split out into a separate issue.
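To illustrate the point about passing settings at creation time, here is a minimal sketch of what a create-index request body with `index.knn` enabled might look like. It is written in Python purely for illustration and is not the MR's code; the function name `build_knn_index_body`, the field name `body_embedding`, and the dimension 768 are all hypothetical.

```python
import json


def build_knn_index_body(vector_field: str, dimension: int) -> dict:
    """Build a create-index body that enables k-NN up front.

    Settings and mappings travel together in one request, so there is
    no separate "create, then update settings" step.
    """
    if dimension <= 0:
        raise ValueError("dimension must be a positive integer")
    return {
        # index.knn has to be part of the creation request.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                # Hypothetical vector field; dimension must match the model.
                vector_field: {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


# Assumed values for illustration only.
body = build_knn_index_body("body_embedding", 768)
print(json.dumps(body, indent=2))
```

The resulting dictionary would be sent as the JSON body of the index creation request (for example `PUT /my-index`, where the index name is again just a placeholder).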
Comment 3 days ago →
🇦🇺Australia kim.pepper 🏄‍♂️🇦🇺Sydney, Australia
Ran into a bit of an issue with the pipelines. In order to have OpenSearch generate the text embeddings, you need to specify text-field-to-embedding-field mappings when creating the pipeline. I don't think it would be easy to dynamically create a pipeline like this with Search API.
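For context, a rough sketch of the kind of pipeline definition this refers to, using a `text_embedding` processor with a `field_map`. It is a Python illustration only, not code from the MR; the function name `build_embedding_pipeline`, the model ID, and the field names are placeholders I made up.

```python
import json


def build_embedding_pipeline(model_id: str, field_map: dict) -> dict:
    """Build an ingest pipeline body with a text_embedding processor.

    field_map maps each source text field to the vector field that
    should receive its embedding, so the full mapping must be known
    when the pipeline is created.
    """
    if not field_map:
        raise ValueError("field_map needs at least one text -> embedding pair")
    return {
        "description": "Generate text embeddings at index time",
        "processors": [
            {
                "text_embedding": {
                    "model_id": model_id,
                    # Copy so later changes to the caller's dict do not leak in.
                    "field_map": dict(field_map),
                }
            }
        ],
    }


# Placeholder model ID and field names, for illustration only.
pipeline = build_embedding_pipeline(
    "example-model-id",
    {"body": "body_embedding", "title": "title_embedding"},
)
print(json.dumps(pipeline, indent=2))
```

Because every text field has to be paired with an embedding field up front, any change to the indexed fields would mean regenerating the pipeline, which is where the difficulty with building it dynamically comes from.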

I'm going to check out the https://www.drupal.org/project/ai_vdb_provider_opensearch module to see if the built-in AI Search would work.

