Filtering on Vocabulary tags does not work for all tags

Created on 29 January 2025, 2 months ago

Problem/Motivation

I can't get consistent results when filtering my search by Taxonomy tags. Some tags do not work: selecting them just removes the content from the result set.

Steps to reproduce

My search server is a Milvus DB on a free account hosted on Zilliz.

I have an AI Search Index that is currently indexing 34 pieces of content from two content types.

Both content types have a body field and use the existing field_tags referencing the Tags Taxonomy. There are
19 tags; each piece of content can have multiple tags, but most have only a single tag. Content moderation has
been disabled on my site. The Content fields indexed are Body and Tags; the Tags are identified as 'filterable attributes'
and type Integer.

I have a search view that uses the index. It has a full-text search filter and a content filter on the Tags, both exposed to the user; the filter is a block that is displayed on my home page.

Consistently, whilst I have been developing this, random tags (I have not identified a pattern) just don't work. After the last index rebuild, two tags did not work. To test, I select a tag in the filter and leave the search box empty; two of the tags return no results when selected. If I set the filter to 'Any' and use the text search with something very specific, to return a piece of content that is tagged with one of the non-working tags, I get the expected result. If I then set the filter to the tag that is on that content, no results are returned.

It feels random because I have had issues with different tags. Sometimes (not always) I can 'fix' the tag by editing and saving the relevant content or by adding a working tag to it and then removing that tag.

In all cases, when I investigate the index in the Zilliz playground, the field_tags meta field looks correct, regardless of whether the tag is working on the Drupal front-end or not.

🐛 Bug report
Status

Active

Version

1.0

Component

AI Search

Created by

🇬🇧United Kingdom chris_hall_hu_cheng

Comments & Activities

  • Issue created by @chris_hall_hu_cheng
  • 🇬🇧United Kingdom scott_euser

    You can try setting different types, e.g. integer instead of string, and see if it helps. In any case, Views filtering support is very limited, as the filtering options available directly in providers like Milvus/Zilliz are not nearly as feature-rich as Views from Drupal Core.

    Moved to Milvus module since filtering support is specific per provider.

    Beyond that, you can instead combine with a Database or Solr index and use the Boost with AI Search plugins for full filter support. Downgrading priority given this valid workaround.

  • 🇬🇧United Kingdom tim corkerton

    @scott_euser I can confirm that I am seeing the same kind of random behavior when trying to filter an AI Search view using exposed taxonomy terms.
    I have added the taxonomy to the Search API field list and set "Filterable attributes" as the indexing option.

    This issue, however, is not confined to Milvus/Zilliz. I have both a Zilliz account and a Pinecone account, and I am seeing identical behavior using Pinecone. It feels like an issue with the AI Search module. This issue might be better moved to another thread, or at least duplicated in the Pinecone provider.

    Scott, I have just watched your video where you discuss the workaround suggested above: https://youtu.be/WZEh4JOGhhM?t=5989 (Great video, by the way!) Can you confirm how this approach works? Does it simply prepend results from the AI search to the results of a Solr search? If so, I don't think that really solves my use case. Any suggestions on how we can help fix this? Can you point to where the exposed filters are handled in the code base?

  • 🇬🇧United Kingdom scott_euser

    Filters still work on top of it, but the filtering needs to be in a View using the Search API Database/SOLR backend, not the AI Search backend.

    It's impossible to maintain (or even achieve in the first place) feature parity with Views Filters, so you really need to use the Database or SOLR backend in the first place.

    It doesn't prepend; it adds them into the query. Here is a pseudo-query example:

    SELECT *
    FROM index
    WHERE
    ( keyword in :search_terms OR entity_id in :results_from_ai_search)
    AND status is published
    AND exposed filter is example
    AND etc
    ORDER BY CASE (order from AI Search).., relevance, etc

    I.e., for keywords it adds an OR, but filters are still applied. So if you get 10 results from the vector database, 5 of those might still be subsequently excluded by filters. BUT those results also might not have been found without the vector database, if there is a semantic match but no actual keyword match (Solr has a known bug in the queue, so it isn't 100% like Database yet).
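
The pseudo query above can be illustrated with a small in-memory Python sketch. This is not Search API or AI Search code: the `Item` class, the `boosted_search` function, and the sample field names are hypothetical, chosen only to show the logic of "keyword match OR AI result, AND filters, ordered by the AI ranking first".

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for an indexed piece of content."""
    entity_id: int
    text: str
    published: bool
    tags: set = field(default_factory=set)


def boosted_search(items, search_terms, ai_result_ids, tag=None):
    """Return entity IDs matching (keyword OR AI result) AND the filters.

    ai_result_ids is assumed to be the ordered list of entity IDs
    returned by the vector database, best match first.
    """
    # Position of each AI result, used later for ordering.
    ai_rank = {eid: pos for pos, eid in enumerate(ai_result_ids)}
    terms = [term.lower() for term in search_terms]

    matches = []
    for item in items:
        words = item.text.lower().split()
        keyword_hit = any(term in words for term in terms)
        ai_hit = item.entity_id in ai_rank
        # (keyword in :search_terms OR entity_id in :results_from_ai_search)
        if not (keyword_hit or ai_hit):
            continue
        # AND status is published
        if not item.published:
            continue
        # AND exposed filter is example (here: a single tag ID)
        if tag is not None and tag not in item.tags:
            continue
        matches.append(item)

    # ORDER BY the AI ranking first; non-AI matches follow, by entity ID.
    matches.sort(key=lambda it: (ai_rank.get(it.entity_id, len(ai_rank)),
                                 it.entity_id))
    return [it.entity_id for it in matches]
```

In this sketch an item found only by the vector database still has to pass the published check and the tag filter, which is why some AI results can drop out of the final list.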

    In any case, if you want to use filterable attributes, it does need to be in the VDB provider issue queue, as that's where the code sits for the basic filtering (nowhere near feature parity) per provider. Pinecone filtering in their API is completely different from Milvus/Zilliz. For Milvus/Zilliz, the code is at MilvusProvider::prepareFilters(), which attempts to convert the query conditions to the filter format described in the Milvus/Zilliz API documentation (again, very basic though).
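
To make the idea of converting query conditions concrete, here is a minimal Python sketch that turns simple conditions into a Milvus-style boolean filter expression string. It is not the module's `MilvusProvider::prepareFilters()` implementation, which is PHP and is not reproduced here; the `(field, operator, value)` tuple shape and the function names are assumptions made for illustration. It relies only on the comparison (`==`), `in [...]`, and `and` forms of Milvus filter expressions.

```python
import json


def format_value(value):
    """Render a Python value as a literal for a filter expression."""
    # bool must be checked before int, since bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are quoted; json.dumps also escapes embedded quotes.
    return json.dumps(str(value))


def to_milvus_expr(conditions):
    """Join hypothetical (field, operator, value) conditions with 'and'."""
    parts = []
    for field_name, operator, value in conditions:
        if operator == "=":
            parts.append(f"{field_name} == {format_value(value)}")
        elif operator == "IN":
            values = ", ".join(format_value(v) for v in value)
            parts.append(f"{field_name} in [{values}]")
        else:
            # Anything beyond the basics is deliberately unsupported here.
            raise ValueError(f"Unsupported operator: {operator}")
    return " and ".join(f"({part})" for part in parts)
```

For example, `to_milvus_expr([("field_tags", "=", 5)])` yields `(field_tags == 5)`, while passing the string `"5"` yields `(field_tags == "5")`. The two expressions compare against different literal types, which is one reason the indexed field type (integer versus string) can matter when filtering.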
