Search API Index fields need more attributes

Created on 3 September 2024, 5 months ago

Problem/Motivation

This is not purely a vector search issue. Any external search database used in Views needs to duplicate field content if that content is to be used for filtering or sorting. One example is 'state', used for filtering published/archived entities: that field has no semantic meaning, but it still needs to be included in the external database. Including such fields in the vector representation of entities can cause problems, especially if the content is short.

The second issue is that some vector databases have a hard limit on the storage of additional fields (e.g. Milvus has it set to 64kB), and if they do not provide keyword search, storing all the fields used for vectorisation is superfluous.

The third issue is that users need better control over which fields are used for vectorisation. Currently, we assume that short fields are metadata, but that will not always be the case.

Proposed resolution

In addition to selecting the type of a field (text/number/...), we need to record additional attributes for that field:

  • Include in vector (radio buttons: Metadata/Content/Exclude)
  • Include as field (boolean checkbox)
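
The two attributes above could be modelled roughly as follows. This is a minimal sketch in Python rather than the module's actual code: the names `VectorRole`, `FieldSettings` and `split_item` are hypothetical, and it assumes that "Metadata" means short context attached alongside the embedded text, "Content" means text that is vectorised, and "Include as field" means the raw value is also stored in the external database for filtering or sorting.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class VectorRole(Enum):
    """Hypothetical values for the proposed 'Include in vector' radios."""
    METADATA = "metadata"
    CONTENT = "content"
    EXCLUDE = "exclude"


@dataclass(frozen=True)
class FieldSettings:
    """Hypothetical per-field settings on the index."""
    data_type: str          # existing attribute: text/number/...
    vector_role: VectorRole  # proposed: Metadata/Content/Exclude
    include_as_field: bool   # proposed: store raw value for filter/sort


def split_item(values: dict[str, Any],
               settings: dict[str, FieldSettings]) -> dict[str, Any]:
    """Split one item's field values according to the per-field settings."""
    content: list[str] = []
    metadata: list[str] = []
    stored: dict[str, Any] = {}
    for name, value in values.items():
        cfg = settings[name]
        if cfg.vector_role is VectorRole.CONTENT:
            content.append(str(value))
        elif cfg.vector_role is VectorRole.METADATA:
            metadata.append(f"{name}: {value}")
        # EXCLUDE contributes nothing to the vector input.
        if cfg.include_as_field:
            stored[name] = value
    return {"content": "\n".join(content),
            "metadata": metadata,
            "stored": stored}


# Example: 'state' is stored for filtering but kept out of the vector,
# while 'body' is vectorised but not duplicated as a stored field.
settings = {
    "title": FieldSettings("text", VectorRole.METADATA, True),
    "body": FieldSettings("text", VectorRole.CONTENT, False),
    "state": FieldSettings("string", VectorRole.EXCLUDE, True),
}
result = split_item(
    {"title": "Hello", "body": "Long text", "state": "published"},
    settings,
)
```

Keeping the two attributes independent covers both problems described above: a field such as 'state' can be filterable without affecting the embedding, and a long body field can be embedded without counting against a per-record storage limit.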

📌 Task
Status: Active
Version: 1.0
Component: AI Search
Created by: seogow (🇬🇧 United Kingdom)
