Moderation calls on by default

Created on 13 June 2024

Problem/Motivation

When connecting to an LLM, content moderation can be switched on or off. Many developers new to AI will not realise the importance or purpose of content moderation and might leave it off simply because they can. This may result in their account being permanently banned.

However, different LLM providers take different approaches to content moderation. Some use a dedicated moderation call; others use their own LLMs set into a kind of "moderation mode". Some are free and some cost money, so there may be many legitimate reasons to turn moderation off (especially when batch processing pre-screened information).

Proposed resolution

A warning will be shown on the LLM configuration page. By default, moderation calls will be turned on wherever the provider supports them.

Description of behaviour:

  • If the LLM provider offers no content moderation, none will be used.
  • If the LLM provider offers content moderation for free, it will be used. The configuration page will say that it has been turned on by default and will give instructions on how to disable it (but this will have to be done in code).
  • If the LLM provider offers content moderation but it costs money, it will be enabled by default. It will be possible to disable it in the advanced settings, but a disclaimer has to be clicked after 10 seconds before the change can be saved. This means it is possible via the UI, but difficult.
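
The three cases above could be sketched roughly as follows. This is only an illustration in Python, not the module's actual code: the names ModerationTier, ModerationPolicy and default_policy are hypothetical, and the sketch assumes each provider can be classified into one of the three tiers.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ModerationTier(Enum):
    """Hypothetical classification of a provider's moderation offering."""
    NONE = "none"  # provider offers no moderation call
    FREE = "free"  # moderation call costs nothing
    PAID = "paid"  # moderation call is billed


@dataclass(frozen=True)
class ModerationPolicy:
    enabled_by_default: bool
    # Where moderation can be switched off: "code", "ui", or None if
    # there is nothing to switch off.
    disable_via: Optional[str]
    # Seconds to wait before the disclaimer can be clicked (UI only).
    disclaimer_delay_seconds: int


def default_policy(tier: ModerationTier) -> ModerationPolicy:
    """Map a provider tier to the default behaviour described above."""
    if tier is ModerationTier.NONE:
        # Nothing to call, so nothing is enabled.
        return ModerationPolicy(False, None, 0)
    if tier is ModerationTier.FREE:
        # On by default; can only be turned off in code.
        return ModerationPolicy(True, "code", 0)
    # Paid: on by default; can be turned off in the advanced settings
    # after the 10-second disclaimer.
    return ModerationPolicy(True, "ui", 10)
```

Keeping the policy as plain data would let the configuration page pick the matching warning text from the same value that drives the behaviour.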

User interface changes

Proposed Copy:
Under Choose an Available key:
Moderation Call: Before a prompt is sent to the LLM, it is sometimes sent to a moderation call first, which checks whether the prompt is safe for the LLM. If it is not, the prompt will not be sent. This is very important, as organisations can sometimes be liable for the prompts sent by their users. Without this you may be BANNED PERMANENTLY from accessing the AI API. Many developers have accidentally had their accounts banned, and to prevent this we have turned on moderation calls by default.
Moderation calls block potentially harmful or offensive prompts and provide some protection against prompt injection.
--------------
This LLM provider has NO moderation call. Your users will be able to send it ANY prompts they want.
-------
This LLM provider has FREE moderation calls. This is turned ON by default. To turn it off, you can change it in code HERE.
Note: If you turn this off, you risk getting your account BANNED if you do not pre-screen prompts yourself.
------
This LLM provider has PAID FOR moderation calls. This is turned ON by default. To turn it off, you can do so in the advanced settings.
----------
Under Advanced Settings:
Default: Enabled.
I Understand:
I understand that disabling the moderation call, whilst it may save me money, might get me PERMANENTLY BANNED.
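
For reference, the moderation call described in the proposed copy (check the prompt first, and only forward it if it passes) might look something like the sketch below. It is a minimal Python illustration under stated assumptions: send_with_moderation, PromptRejected, is_safe and send_to_llm are hypothetical names, and the two callables stand in for whatever the provider integration actually exposes.

```python
from typing import Callable, Optional


class PromptRejected(Exception):
    """Raised when the moderation call flags a prompt (hypothetical)."""


def send_with_moderation(
    prompt: str,
    send_to_llm: Callable[[str], str],
    is_safe: Optional[Callable[[str], bool]] = None,
) -> str:
    """Run the moderation check, if any, before forwarding the prompt.

    is_safe is None when the provider has no moderation call or when
    moderation has been disabled; the prompt is then sent unchecked.
    """
    if is_safe is not None and not is_safe(prompt):
        # The prompt is never forwarded to the LLM in this case.
        raise PromptRejected("prompt failed the moderation call")
    return send_to_llm(prompt)
```

Raising an exception rather than returning an empty response is one possible design choice; it makes a rejected prompt hard to ignore silently in calling code.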

✨ Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇬🇧 United Kingdom yautja_cetanu

