- Issue created by @marcus_johansson
- 🇩🇪 Germany breidert
Additional information from weekly meeting:
The UI/UX has to make it simple to create and manage guardrails. However, since there are many things to configure, this could become complex.
Guardrails can be general and apply to broad functionality such as AI Translation or Content Suggestions, where you might just need to block things like PII data or flag things like `<script>` tags. Guardrails can also be very specific and be tied to a single agent or a tool.
A specific example of a guardrail for a single agent might be something like:
"Check that the text in the image only includes cooking instructions, nothing else."
You'd only want that guardrail running for the agent that generates food recipes, not every AI process. A UI/UX should work for all use cases.
- 🇬🇧 United Kingdom yautja_cetanu
"it can immediately raise an error, which stops the expensive model from running and saves you time/money."
From the OpenAI Agents SDK, I can't see whether guardrails HAVE to stop the execution or merely CAN stop it. I think stopping should be the "default" approach for guardrails, but not the only one. Instead, it should be possible (even if not in version 1.1) to:
- Have guardrails stop the execution.
- Have guardrails go back to the agent to give it another go.
- Have guardrails trigger some kind of end-user action that would allow the execution to continue where it left off, terminate, or start down a new path.
I don't think we should build all of the options above, but I think we should keep them in mind as possibilities.
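To make the three options listed above concrete, here is a minimal, language-neutral sketch in Python. It is not the OpenAI Agents SDK or the AI module API: `OnViolation`, `Outcome`, and `run_with_guardrail` are hypothetical names invented for illustration, and the "agent" is a toy function.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class OnViolation(Enum):
    """Hypothetical policy for what happens when a guardrail fails."""
    STOP = "stop"          # abort the run (the proposed default)
    RETRY = "retry"        # send the agent back for another attempt
    ASK_USER = "ask_user"  # pause and let an end user decide


@dataclass
class Outcome:
    status: str            # "ok", "stopped", or "awaiting_user"
    output: Optional[str]
    attempts: int


def run_with_guardrail(
    generate: Callable[[int], str],
    is_allowed: Callable[[str], bool],
    on_violation: OnViolation = OnViolation.STOP,
    max_attempts: int = 3,
) -> Outcome:
    """Run `generate` and apply one guardrail check to each result."""
    for attempt in range(1, max_attempts + 1):
        output = generate(attempt)
        if is_allowed(output):
            return Outcome("ok", output, attempt)
        if on_violation is OnViolation.STOP:
            return Outcome("stopped", None, attempt)
        if on_violation is OnViolation.ASK_USER:
            # Keep the output so a human can resume, terminate, or redirect.
            return Outcome("awaiting_user", output, attempt)
        # OnViolation.RETRY: fall through and try again.
    return Outcome("stopped", None, max_attempts)


# Toy agent: the first draft leaks a script tag, the second is clean.
drafts = {1: "<script>alert(1)</script>", 2: "Boil the pasta for 9 minutes."}
agent = lambda attempt: drafts.get(attempt, drafts[2])
no_script = lambda text: "<script>" not in text

stopped = run_with_guardrail(agent, no_script, OnViolation.STOP)
retried = run_with_guardrail(agent, no_script, OnViolation.RETRY)
paused = run_with_guardrail(agent, no_script, OnViolation.ASK_USER)
```

With the retry policy the toy agent succeeds on its second attempt, while the stop policy ends the run after the first failed check. A real implementation would also need to decide how a paused run is stored and resumed, which this sketch leaves out.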
- First commit to issue fork.
- 🇮🇹 Italy lussoluca Italy
Sorry, I wanted to push an initial stub, but then I realized that this issue was opened in the `ai_agents` module (I worked on the `ai` module...). I think that guardrails are a generic concept that can be applied to every interaction with an LLM, not only when using agents. Maybe we should move this issue to the `ai` project?
- 🇩🇪 Germany marcus_johansson
You are right Luca - we will need an issue here as well, for actual UI implementation, but that is dependent on the other issue. Will move.
- 🇩🇪 Germany breidert
Since guardrails can appear in many places such as automators, agents, tools, content suggestions, CKEditor, etc., we need a list of configuration UIs where the functionality can be added. Best would be with screenshots as we might need UX work to create a pleasant configuration experience.
- 🇺🇸 United States Kristen Pol Santa Cruz, CA, USA
Switching to the correct tag
- 🇮🇹 Italy lussoluca Italy
"Since guardrails can appear in many places such as automators, agents, tools, content suggestions, CKEditor, etc., we need a list of configuration UIs where the functionality can be added. Best would be with screenshots as we might need UX work to create a pleasant configuration experience."
I agree with you, maybe we should open a separate issue for the UI part?
- @lussoluca opened merge request.
- 🇮🇹 Italy lussoluca Italy
A new plugin type named `AiGuardrail` is added to define guardrail implementations. A guardrail plugin should implement `ConfigurableInterface` and `PluginFormInterface` from Core (to make the plugin configurable by a form). A new interface is created to mark plugins that need access to the `AiPluginManager` service: `NeedsAiPluginManagerInterface`.
An `AiGuardrail` plugin must implement the `processInput` method, which takes a `ChatInput` as input and returns a `GuardrailResultInterface`. The AI module provides some implementations of `GuardrailResultInterface`:
- `PassResult`: indicates the input can pass without changes
- `StopResult`: indicates the input should not be processed further (and a standard message has to be sent to the user)
- `RewriteInputResult`: indicates the input should be rewritten (maybe to remove some PII)
- `RewriteOutputResult`: indicates the output should be rewritten (maybe to remove unwanted information from an LLM response)
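As an illustration of the plugin contract and result types described above, here is a small Python sketch. It mirrors the shape of the design only and is not the module's PHP code: the input is simplified to a plain string instead of a `ChatInput`, `process_input` stands in for `processInput`, and the two example guardrails (`BlockScriptTags`, `RedactEmails`) and the email pattern are hypothetical.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass


class GuardrailResult(ABC):
    """Marker base class, standing in for GuardrailResultInterface."""


@dataclass
class PassResult(GuardrailResult):
    pass


@dataclass
class StopResult(GuardrailResult):
    message: str  # standard message to show the user


@dataclass
class RewriteInputResult(GuardrailResult):
    text: str  # replacement for the original input


@dataclass
class RewriteOutputResult(GuardrailResult):
    text: str  # replacement for the LLM response (unused in this sketch)


class Guardrail(ABC):
    """Stand-in for an AiGuardrail plugin; input simplified to a string."""

    @abstractmethod
    def process_input(self, text: str) -> GuardrailResult:
        ...


class BlockScriptTags(Guardrail):
    """Hypothetical guardrail: refuse any input containing a script tag."""

    def process_input(self, text: str) -> GuardrailResult:
        if "<script" in text.lower():
            return StopResult("Script tags are not allowed.")
        return PassResult()


class RedactEmails(Guardrail):
    """Hypothetical guardrail: mask email addresses (a simple PII example)."""

    # Deliberately simple pattern, good enough for a demonstration only.
    EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

    def process_input(self, text: str) -> GuardrailResult:
        redacted = self.EMAIL.sub("[email]", text)
        if redacted != text:
            return RewriteInputResult(redacted)
        return PassResult()


blocked = BlockScriptTags().process_input("<SCRIPT>alert(1)</script>")
rewritten = RedactEmails().process_input("Mail me at ann@example.com please")
untouched = RedactEmails().process_input("No contact details here")
```

Returning a typed result object, rather than a boolean, lets the caller distinguish "pass", "stop", and "rewrite" without inspecting the guardrail itself.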
A new configuration entity named `ai_guardrail` is added, including an ID, label, description, the `AiGuardrail` plugin to use, and the plugin settings.
An `AiGuardrailForm` class is provided to render an `AiGuardrail` plugin form and to save the results as an `AiGuardrail` entity.
A single guardrail is usually insufficient to protect a conversation between users and an LLM. We want some guardrails to run on the chat input and others to run on the LLM response, before sending the text to the user. To represent this standard behaviour, a new `ai_guardrail_set` configuration entity is added, including an ID, label, description, a list of guardrail plugins that must be run before sending the prompt to an LLM (`pre_generate_guardrails`), and a list of guardrail plugins that must be run after a response from an LLM is received (`post_generate_guardrails`). An `AiGuardrailSetForm` class is provided to create and configure a guardrail set with a UI.
Guardrail checks run in an event subscriber, configured to listen to the `ai.pre_generate_response` and `ai.post_generate_response` events.
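The flow described above (a guardrail set with pre- and post-generation lists, run from a subscriber on two events) could be sketched roughly as follows. This is a Python illustration under stated assumptions, not the module's implementation: only the two event names and the `pre_generate_guardrails` / `post_generate_guardrails` field names come from the description, while `Dispatcher`, `Event`, `GuardrailSubscriber`, `chat`, and the convention that a check returns `None` to stop are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

PRE_GENERATE = "ai.pre_generate_response"
POST_GENERATE = "ai.post_generate_response"

# Hypothetical contract: a check returns the (possibly rewritten) text,
# or None to signal that processing must stop.
Check = Callable[[str], Optional[str]]


@dataclass
class GuardrailSet:
    pre_generate_guardrails: List[Check] = field(default_factory=list)
    post_generate_guardrails: List[Check] = field(default_factory=list)


@dataclass
class Event:
    text: str
    stopped: bool = False


class Dispatcher:
    """Tiny stand-in for an event dispatcher."""

    def __init__(self) -> None:
        self._listeners: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, name: str, listener: Callable[[Event], None]) -> None:
        self._listeners[name].append(listener)

    def dispatch(self, name: str, event: Event) -> Event:
        for listener in self._listeners[name]:
            listener(event)
        return event


class GuardrailSubscriber:
    """Runs the matching guardrail list when each event fires."""

    def __init__(self, guardrail_set: GuardrailSet) -> None:
        self.set = guardrail_set

    def register(self, dispatcher: Dispatcher) -> None:
        dispatcher.subscribe(PRE_GENERATE, self.on_pre_generate)
        dispatcher.subscribe(POST_GENERATE, self.on_post_generate)

    def on_pre_generate(self, event: Event) -> None:
        self._run(self.set.pre_generate_guardrails, event)

    def on_post_generate(self, event: Event) -> None:
        self._run(self.set.post_generate_guardrails, event)

    @staticmethod
    def _run(checks: List[Check], event: Event) -> None:
        for check in checks:
            result = check(event.text)
            if result is None:
                event.stopped = True  # later checks are skipped
                return
            event.text = result


STOP_MESSAGE = "Sorry, this request cannot be processed."


def chat(dispatcher: Dispatcher, llm: Callable[[str], str], prompt: str) -> str:
    """Dispatch the pre event, call the model, then dispatch the post event."""
    pre = dispatcher.dispatch(PRE_GENERATE, Event(prompt))
    if pre.stopped:
        return STOP_MESSAGE  # the model is never called
    post = dispatcher.dispatch(POST_GENERATE, Event(llm(pre.text)))
    return STOP_MESSAGE if post.stopped else post.text


block_scripts = lambda t: None if "<script" in t.lower() else t
mask_secret = lambda t: t.replace("hunter2", "[redacted]")

dispatcher = Dispatcher()
GuardrailSubscriber(GuardrailSet([block_scripts], [mask_secret])).register(dispatcher)
echo_llm = lambda prompt: f"You said: {prompt}. The password is hunter2."
```

Because the checks hang off events rather than off a specific caller, any code path that dispatches the two events gets the same guardrails, which matches the point made earlier in the thread that guardrails are not specific to agents.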
An initial integration with the AI module is provided in the `chat_generator` `AiApiExplorer` plugin: a new select is added to the Advanced accordion that can be used to choose which `ai_guardrail_set` to use.
See the attached screen recording for a demo of the UI.
- Assigned to lussoluca
- Status changed to Needs review
7:27am 1 September 2025
- 🇮🇹 Italy lussoluca Italy
First POC is ready to be reviewed.
- This MR adds the guardrail and guardrail_set concepts
- This MR adds guardrail support for agents
- This project implements an initial set of guardrails (I've separated the code for guardrails to ease dependency management)