Improve the AI Search recursive retrieval of a specific quantity of results

Created on 24 May 2025, 7 days ago

Problem/Motivation

At the moment, if you follow the code around Drupal\ai_search\Plugin\search_api\backend\SearchApiAiSearchBackend::$maxAccessRetries, we re-attempt the search up to 10 times to reach a specific count (limit) of results.

For some scenarios, such as AI Related Content on a site where large nodes have been broken into many smaller chunks, even this many iterations may not be sufficient, especially if the query has no access filter and the subsequent access checks also exclude many nodes (e.g. a site driven largely by member-only content).
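To illustrate how a fixed retry budget can run out, here is a minimal, language-neutral sketch in Python. It is not the module's PHP code: it assumes each retry fetches the next page of `limit` chunk hits, and the names `collect_results`, `ranked_chunks`, and `can_access` are hypothetical.

```python
from typing import Callable

MAX_ACCESS_RETRIES = 10  # mirrors the retry cap described above


def collect_results(
    ranked_chunks: list[str],
    can_access: Callable[[str], bool],
    limit: int,
    max_retries: int = MAX_ACCESS_RETRIES,
) -> list[str]:
    """Page through chunk hits until `limit` distinct accessible entities are found.

    Each item in `ranked_chunks` is the entity ID that a chunk belongs to,
    ordered by relevance (assumed paging model, for illustration only).
    """
    found: list[str] = []
    seen: set[str] = set()
    offset = 0
    for _ in range(max_retries):
        page = ranked_chunks[offset:offset + limit]
        if not page:
            break  # the index has no more chunks to offer
        offset += limit
        for entity_id in page:
            if entity_id in seen:
                continue  # another chunk of an entity we already handled
            seen.add(entity_id)
            if can_access(entity_id):
                found.append(entity_id)
                if len(found) == limit:
                    return found
    return found


# One restricted node with 30 chunks outranks the two public nodes.
ranked = ["restricted:1"] * 30 + ["public:1"] * 30 + ["public:2"] * 5
is_public = lambda entity_id: entity_id.startswith("public:")

print(collect_results(ranked, is_public, limit=2))                   # []
print(collect_results(ranked, is_public, limit=2, max_retries=100))  # ['public:1', 'public:2']
```

With `limit=2`, ten retries only inspect the first 20 chunks, all of which belong to the same restricted node, so nothing is returned even though two accessible nodes exist further down the ranking.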

Steps to reproduce

  1. Have a site with many access controlled nodes
  2. Have large content lengths with small chunk size
  3. Attempt to search and retrieve a specific number of results

Proposed resolution

Improve the iteration by allowing Vector Databases to say they are either:

  1. A vector database that supports grouping or aggregation of some form, like https://milvus.io/docs/grouping-search.md. We can group by drupal_long_id. So far this seems to be only Milvus (big win for Milvus!).
  2. A vector database that supports filtering by NOT IN an array of already-found drupal_long_id values. Most VDB Providers (if not all) should be able to support this.
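The effect of option (1) can be sketched with an in-memory stand-in for a grouping search. This is only an illustration in Python, not Milvus or module code: `ChunkHit`, `top_k_plain`, `top_k_grouped`, and the ID values are hypothetical, and the grouping is done client-side here purely to show the intended result.

```python
from typing import NamedTuple


class ChunkHit(NamedTuple):
    drupal_long_id: str  # illustrative ID format, assumed for this example
    chunk_index: int
    score: float


def top_k_plain(hits: list[ChunkHit], k: int) -> list[ChunkHit]:
    """Ungrouped search: the k best chunks, which may all share one entity."""
    return sorted(hits, key=lambda h: h.score, reverse=True)[:k]


def top_k_grouped(hits: list[ChunkHit], k: int) -> list[ChunkHit]:
    """Grouped search: the best chunk per drupal_long_id, then the k best groups."""
    best: dict[str, ChunkHit] = {}
    for hit in hits:
        kept = best.get(hit.drupal_long_id)
        if kept is None or hit.score > kept.score:
            best[hit.drupal_long_id] = hit
    return sorted(best.values(), key=lambda h: h.score, reverse=True)[:k]


hits = [
    ChunkHit("entity:node/1:en", 0, 0.95),
    ChunkHit("entity:node/1:en", 1, 0.93),
    ChunkHit("entity:node/1:en", 2, 0.91),
    ChunkHit("entity:node/2:en", 0, 0.80),
    ChunkHit("entity:node/3:en", 0, 0.70),
]

print([h.drupal_long_id for h in top_k_plain(hits, 3)])    # three chunks of node/1
print([h.drupal_long_id for h in top_k_grouped(hits, 3)])  # node/1, node/2, node/3
```

Without grouping, a request for 3 results yields a single distinct entity; with grouping, the same request yields 3 distinct entities in one query, before any access checks run.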

So I think this probably needs some more changes to the VDB Provider interfaces. For (1) it's a pre-query change; for (2) it's a post-query condition set by the VDB Provider on the recursive ::doSearch() call.
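A rough sketch of how option (2) could shape the retry loop, again in Python rather than the module's PHP. Everything here is an assumption for illustration: `Hit`, `search_excluding` (a stand-in for a provider applying a `drupal_long_id NOT IN (...)` filter), and `do_search` are hypothetical and do not reflect an existing interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Hit:
    drupal_long_id: str
    score: float


def search_excluding(hits: list[Hit], limit: int, excluded: set[str]) -> list[Hit]:
    """Stand-in for a vector DB query filtered by drupal_long_id NOT IN excluded."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    return [h for h in ranked if h.drupal_long_id not in excluded][:limit]


def do_search(
    hits: list[Hit],
    limit: int,
    can_access: Callable[[str], bool],
    max_retries: int = 10,
) -> list[str]:
    """Retry with a growing exclusion list so every attempt surfaces new entities."""
    found: list[str] = []
    excluded: set[str] = set()
    for _ in range(max_retries):
        page = search_excluding(hits, limit, excluded)
        if not page:
            break  # nothing left that has not been seen already
        for hit in page:
            if hit.drupal_long_id in excluded:
                continue  # second chunk of an entity handled in this same page
            excluded.add(hit.drupal_long_id)
            if can_access(hit.drupal_long_id):
                found.append(hit.drupal_long_id)
                if len(found) == limit:
                    return found
    return found


# Three restricted nodes outrank two public ones; every node has 10 chunks.
ids = ["restricted:1", "restricted:2", "restricted:3", "public:1", "public:2"]
hits = [
    Hit(entity_id, float(50 - (i * 10 + j)))
    for i, entity_id in enumerate(ids)
    for j in range(10)
]
is_public = lambda entity_id: entity_id.startswith("public:")

print(do_search(hits, limit=2, can_access=is_public, max_retries=5))
# ['public:1', 'public:2']
```

Because every retry excludes the entities already seen, each attempt is guaranteed to surface at least one new entity, so the number of retries needed depends on the number of inaccessible entities rather than on how many chunks each one has.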

Remaining tasks

  1. Merge request to build in this functionality
  2. Decide when to implement this, as it will probably be a breaking change and require a coordinated release of VDB providers. I suggest 2.0.x.

User interface changes

N/A

API changes

TBD

Data model changes

N/A

✨ Feature request
Status

Active

Version

2.0

Component

AI Search

Created by

🇬🇧 United Kingdom scott_euser
