Add search item id to tags

Created on 28 March 2025, 2 months ago

Problem/Motivation

In support of https://www.drupal.org/node/3516044, the tags sent to the moderation provider should be changed to include the search item ID, so that we have a reference to the item that has been indexed.

✨ Feature request
Status

Active

Version

1.1

Component

AI Search

Created by

🇧🇪 Belgium jonas139


Merge Requests

Comments & Activities

  • Issue created by @jonas139
  • Merge request !535 #3516046: Add extra tags for moderation (Open) created by JeroenT
  • Pipeline finished with Failed
    2 months ago
    #459818
  • 🇬🇧 United Kingdom scott_euser

    I only wonder whether we should do this without changing the method signature, in case someone is extending the class, e.g. by looping through the raw embeddings to add the tag.

  • 🇧🇪 Belgium jonas139

    This was the easiest for me at the moment but I do reckon it's not the best solution. Do you have any other suggestions?

  • 🇬🇧 United Kingdom MrDaleSmith
  • Pipeline finished with Success
    2 months ago
    #463474
  • First commit to issue fork.
  • 🇩🇪 Germany marcus_johansson

    @scott_euser - I made a version that switches to just storing the index object as an object variable and setting the tag if it's set. Not the nicest solution, but it doesn't break the base class.

  • 🇬🇧 United Kingdom scott_euser

    Hmmm I don't quite understand the purpose of this, I think the issue summary needs more detail perhaps? In Embeddingbase::getEmbedding() we already have the Search API Item ID attached to each embedding. What value does attaching it to the raw embedding give us?

    Is the reasoning so that when the LLM is generating the embedding it knows the source to be able to bail on generating the raw embeddings in the first place if the content seems inappropriate/malicious?

    Given AI Search is still experimental, if we match the core change policy we can change things in patch releases: https://www.drupal.org/about/core/policies/core-change-policies/experime... I'm not sure how often people would actually extend the embedding strategy base, but if they do, then however we make the change they may need to update to fill in the Search API item, so we might as well pass it as an argument to getRawEmbeddings()? I would guess 99% or maybe 99.9% of sites don't extend these, but I'm open to other opinions...

  • 🇩🇪 Germany marcus_johansson

    I looked into the parent issue and the reason as far as I can see from the other issue is this:

    Someone is trying to embed some piece of content and the Embeddings call fails due to the moderation API. Right now in the OpenAI module this is hardcoded to do moderation checks if you have moderation enabled. When this fails, that tag is to be forwarded into the moderation call so this can be logged somehow for editors to check where it's failing to embed.

    For the parent issue I have added a suggestion on how we should solve it first (✨ Moderation log overview, Active), but it would make sense to have the index as a tag here, so you can get it logged (or take other action on it).

    If we can keep it as it was before, then that is an easier solution and we should revert back to @jonas139's code.

  • Issue was unassigned.
  • Status changed to Needs review 6 days ago
  • 🇪🇸 Spain gxleano Cáceres

    Someone is trying to embed some piece of content and the Embeddings call fails due to the moderation API. Right now in the OpenAI module this is hardcoded to do moderation checks if you have moderation enabled. When this fails, that tag is to be forwarded into the moderation call so this can be logged somehow for editors to check where it's failing to embed.

    Could we consider this as being handled by https://www.drupal.org/project/ai/issues/3526710 🐛 [Error] The Prompt is unsafe: The prompt was flagged by the moderation model, stop the indexation (Active)?

  • 🇬🇧 United Kingdom scott_euser

    I didn't quite follow the parent issue, but I think we are roughly agreed that we can close this now that 🐛 [Error] The Prompt is unsafe: The prompt was flagged by the moderation model, stop the indexation (Active) is in? Thanks all!
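
The compatibility trade-off discussed in this thread can be sketched roughly as follows. This is a language-neutral illustration written in Python, not the module's actual PHP code: the class names, the `item_id` property, the `build_tags()` helper, and the tag format are all hypothetical. It shows the variant where the item ID is kept as optional object state, so a subclass written against the original method signature keeps working.

```python
from typing import Optional


class EmbeddingStrategyBase:
    """Hypothetical stand-in for an embedding strategy base class."""

    def __init__(self) -> None:
        # Optional state: stays None unless the indexer sets it.
        self.item_id: Optional[str] = None

    def build_tags(self) -> list:
        # Assumed tag format; the real tags sent to a moderation
        # provider may look different.
        tags = ["ai_search"]
        if self.item_id is not None:
            tags.append(f"item_id:{self.item_id}")
        return tags

    def get_raw_embeddings(self, chunks: list) -> list:
        # The signature is unchanged; the item ID travels via object state.
        tags = self.build_tags()
        return [{"chunk": chunk, "tags": tags} for chunk in chunks]


class CustomStrategy(EmbeddingStrategyBase):
    """A subclass written against the original signature still works."""

    def get_raw_embeddings(self, chunks: list) -> list:
        return super().get_raw_embeddings([c.strip() for c in chunks])


strategy = CustomStrategy()
without_id = strategy.get_raw_embeddings([" first chunk "])

strategy.item_id = "entity:node/1:en"
with_id = strategy.get_raw_embeddings([" first chunk "])
```

The alternative raised above, passing the item as an extra argument to the method, makes the dependency explicit but requires every overriding subclass to update its signature.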
