Moderation log overview

Created on 28 March 2025, 2 months ago

Problem/Motivation

When moderation is enabled, an error is thrown whenever a prompt is flagged.
The issue "Log flagged prompts" (https://www.drupal.org/project/ai_provider_openai/issues/3507407, currently Active) already adds a logging mechanism, but the error is still thrown, so it's not very user friendly.

It seems better to not throw an error but log the flagged prompt in a separate overview that can be used to track down the related entity (if there is any) so a content manager can review the flagged prompt and rewrite the piece of content.

I am aware this is only needed when moderation is used with entities, not when it is used for manual prompting, so whether to include it is up for discussion.

Proposed resolution

An option in the config form to enable moderation logging so every flagged prompt will be added to an overview where a content manager can review the specific linked content and adjust the content so it passes moderation. When the content has passed moderation, the log should be removed from the overview (on cron or on save) and if the flag is still present, the log will be updated.
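The lifecycle described above (add a log when a prompt is flagged, update it while the flag persists, remove it once the content passes) can be sketched roughly as follows. This is a framework-neutral illustration in Python, not Drupal code and not the code in the merge request; `ModerationLog`, `ModerationLogOverview`, `check` and the `is_flagged` callback are hypothetical names made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical key for the related entity: (entity type, entity id).
EntityKey = Tuple[str, int]


@dataclass
class ModerationLog:
    """One overview row for a flagged prompt (hypothetical structure)."""
    prompt: str
    times_flagged: int = 1


@dataclass
class ModerationLogOverview:
    """Keeps at most one log per entity instead of throwing an error."""
    # Assumed stand-in for the moderation call: True means "flagged".
    is_flagged: Callable[[str], bool]
    logs: Dict[EntityKey, ModerationLog] = field(default_factory=dict)

    def check(self, entity: EntityKey, prompt: str) -> bool:
        """Run on save or on cron. Returns True when the prompt passes."""
        if self.is_flagged(prompt):
            existing = self.logs.get(entity)
            if existing is None:
                # First failure: add a new log for this entity.
                self.logs[entity] = ModerationLog(prompt)
            else:
                # Still flagged: update the existing log.
                existing.prompt = prompt
                existing.times_flagged += 1
            return False
        # Passed moderation: remove the log if there was one.
        self.logs.pop(entity, None)
        return True
```

With this shape, a content manager's overview is simply the current contents of `logs`: an entry disappears as soon as a later `check` for the same entity passes.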

A change in ai_search is also needed for this approach.

Remaining tasks

- Create an extra option on the config form to enable/disable moderation logging
- Add a new Moderation Log entity
- Create an overview
- Make sure a new log entity is added and managed when a prompt is flagged

User interface changes

An extra overview for moderation logs will be available.

Data model changes

An extra Moderation Log entity will be installed.

Feature request
Status

Active

Version

1.1

Component

Code

Created by

🇧🇪Belgium jonas139


Merge Requests

Comments & Activities

  • Issue created by @jonas139
  • Merge request !26#35516044: Add moderation logging → (Open) created by jonas139
  • 🇧🇪Belgium jonas139

    I added an MR for review. It's a basic concept so feedback is welcome.

  • Pipeline finished with Failed
    2 months ago
    #459830
  • Pipeline finished with Failed
    2 months ago
    #463473
  • 🇬🇧United Kingdom MrDaleSmith

    Not a maintainer and this is just my 2p worth, but this seems like a lot of code to reinvent "log an error message" for a handful of people who need it. The exception throwing has been standardised as the AI module's method of telling the code that something has gone wrong and not to continue processing (which might involve more AI calls and so costs) and I'd be reluctant to introduce something non-standard for one area as that would become expensive to maintain and cause confusion for users.

    I'm not sure why standard Drupal logging of the error with more relevant information included won't suit the desired purpose here?

    There are multiple test fails in the code so this can't be merged, but I would maybe wait for the maintainers to chip in about the approach before spending any more time working on it?

  • 🇧🇪Belgium jonas139

    I've used separate entities and a separate overview because, for me, this doesn't belong in the general log overview. If you have multiple content managers (who may not have access to the watchdog), they can check this overview and edit the content that belongs to them, so that when they save again the entity is removed from the overview. In that sense it's not used as a real 'log' overview. I do acknowledge the phpstan errors and will look into them if this is still feasible. I shared this because in my use case the client was very happy to have it. But like you said, we'll see what the maintainers have to say.

  • 🇬🇧United Kingdom MrDaleSmith

    Possibly this would make more sense within the ai_logging sub-module: that already has custom AI-related logs, and moderation failures are something that can happen for any provider type (although I believe OpenAI is the only one that moderates by default).

  • 🇩🇪Germany marcus_johansson

    Thanks @jonas139. I think the better solution here would be to fix the moderationEndpoints to actually use the moderation abstraction in the OpenAI module. If that abstraction is used, the AI Logging module already takes care of logging this. We created this solution before we had abstracted it, so it has a hardcoded, less than perfect solution; it even has a hardcoded model.

    I don't think we should introduce a new entity just because you install a provider.

    So what would be needed is:

    1. On each moderationEndpoints call, make sure that the tags are forwarded from the original call.
    2. Inside the moderations call, remove all hardcoded solutions and instead normalize the input and do a moderation() call.

    To be able to log embeddings for instance going wrong, you can then turn on AI Logging and only log moderation calls and filter for the tag of the embeddings (or whatever you are looking for).

    This would also clean up old hardcoded code that shouldn't be in there anymore.

    Your thoughts about this?

  • 🇧🇪Belgium jonas139

    Thanks for looking into this @marcus_johansson. The new entity is indeed not the best approach, but what I'm trying to achieve is an overview of all the failed moderations that is updated automatically. For instance, when a moderation has failed, it should be kept in the overview until the moderation succeeds, so that the user knows when the content is valid. Maybe an overview is not the way to go and we should add a notice on the entity itself instead, but I leave that open for discussion. If you think this is possible with the AI Logging mechanism, I will look into that! Or if you think this is not a feasible feature, I will just close this issue of course.

  • Issue was unassigned.
  • Status changed to Closed: won't fix 3 days ago
  • 🇧🇪Belgium jonas139

    Will close this in favor of the bug report "[Error] The Prompt is unsafe: The prompt was flagged by the moderation model, stop the indexation" (https://www.drupal.org/project/ai/issues/3526710, currently Active).
