Create visible testing framework

Created on 15 May 2025, 3 months ago

Problem/Motivation

From Slack:

Things we need it to do: - In our new version, we don't need to actually write functional tests or unique tests to check whether the agents have worked (for example, that an actual taxonomy term has been created). We just need to check that the tool was called, sometimes with specific parameters. This can all be created, written, and maybe even run in a UI to help with prompt engineering and with testing different models.
Below are changes to the above features (we should put them in an issue).
Drupal CMS Core Agents Tests - Chat one
Create a Test with a name and description
Write out the Version of Drupal you are testing (Drupal CMS? Module version numbers) - Can be obtained automatically from the environment
Can be enabled or disabled.
Every Assistant Chat message - test if it picks the correct Agent
We need a UI for creating a collection of chats (so there will be a history) and which agent it should find.
UI for selecting the "Agent Test" we want to add to the chat message.
Similar to the General Agents tests, but the input is created by an LLM, not a model prompt.
Test whether it calls the correct tools.
Tests need to be nested.
In the Report, we need to say what models were used
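The chat test described above (a history of chat messages, each with the agent it should find, plus a report that names the model) could be sketched roughly as follows. This is only an illustration in Python, not the module's code: `ChatStep`, `run_chat_test`, the `pick_agent` callable, and the agent and model names are all hypothetical, and the keyword-based stub stands in for a real assistant call.

```python
from dataclasses import dataclass


@dataclass
class ChatStep:
    """One chat message and the agent the assistant is expected to pick."""
    message: str
    expected_agent: str


def run_chat_test(steps, pick_agent, model):
    """Feed each message to pick_agent with the history so far and build a report."""
    history = []
    results = []
    for step in steps:
        picked = pick_agent(history, step.message)
        results.append({
            "message": step.message,
            "expected": step.expected_agent,
            "picked": picked,
            "passed": picked == step.expected_agent,
        })
        history.append(step.message)
    # The report records which model was used, as the list above asks for.
    return {
        "model": model,
        "passed": all(r["passed"] for r in results),
        "steps": results,
    }


def fake_pick_agent(history, message):
    # Hypothetical stand-in for the real assistant; routes on a keyword only.
    return "taxonomy_agent" if "term" in message.lower() else "content_type_agent"


steps = [
    ChatStep("Create a content type called Event", "content_type_agent"),
    ChatStep("Add a term called Music to Tags", "taxonomy_agent"),
]
report = run_chat_test(steps, fake_pick_agent, model="example-model")
```

In a real run, `pick_agent` would call the assistant with the chosen provider and model; the stub only keeps the sketch self-contained.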
General Agents Tests: - Prompt Engineering Tool
What to do
Create a Test with a name and description.
Write out the Version of Drupal you are testing (Drupal CMS? Module version numbers) - Can be obtained automatically from the environment. The GUI can pick which model and provider to use.
Pick an agent to test.
Insert model prompt (Which contains all the context it needs)
Select the Tools it should pick.
Select the order in which the tools are picked. (Whether the order is tested is optional.)
Tools can be picked more than once
Select the parameters it should pass to the tools where appropriate (Optional)
Success means the correct tool is picked with the correct parameters.
Some way of re-running the tests and undoing everything it did
"We then need some concept of "History" that isn't the same as chat history." - We need its version of "History"
Notes:
These have to be tests that run in a real full environment, not unit tests
We also want them to be run as kernel tests
These can be enabled or disabled.
We can create a UI in Agent Explorer, where I can pick a specific test.
We could make a UI in Agent Explorer so that when I run something, I can click a button and it will create a test with all the parameters filled out.
Also record the time it took.
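The success rule for General Agents tests (the expected tools are called, optionally in order, optionally with specific parameters, and possibly more than once) might be checked along these lines. This is a minimal sketch under assumptions, not the framework's implementation: `ToolCall`, `ExpectedCall`, `tools_match`, and the tool names are hypothetical, extra tool calls are assumed to be tolerated, and only the listed parameters are compared.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToolCall:
    """One recorded tool invocation (hypothetical shape)."""
    name: str
    params: dict[str, Any] = field(default_factory=dict)


@dataclass
class ExpectedCall:
    """One expected tool call; params=None means parameters are not checked."""
    name: str
    params: Optional[dict[str, Any]] = None


def call_matches(expected, actual):
    if expected.name != actual.name:
        return False
    if expected.params is None:
        return True
    # Only the listed parameters are compared; extra actual parameters are ignored.
    return all(k in actual.params and actual.params[k] == v
               for k, v in expected.params.items())


def _match_unordered(expected, actual):
    # Backtracking: every expected call must be paired with a distinct actual call.
    if not expected:
        return True
    first, rest = expected[0], expected[1:]
    return any(
        call_matches(first, a) and _match_unordered(rest, actual[:i] + actual[i + 1:])
        for i, a in enumerate(actual)
    )


def tools_match(expected, actual, check_order=False):
    if check_order:
        # Expected calls must appear in this order among the actual calls.
        remaining = iter(actual)
        return all(any(call_matches(e, a) for a in remaining) for e in expected)
    return _match_unordered(list(expected), list(actual))


# Hypothetical recorded run: the same tool is called twice.
actual = [
    ToolCall("create_vocabulary", {"name": "Tags"}),
    ToolCall("create_term", {"name": "A"}),
    ToolCall("create_term", {"name": "B"}),
]
ok = tools_match(
    [ExpectedCall("create_term"), ExpectedCall("create_term", {"name": "A"})],
    actual,
)
```

Listing the same tool twice in the expected calls covers the "picked more than once" case, since each expected entry has to be paired with its own recorded call.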
Complex Agents Test
The same as general agents but they can call other agents as tools which can call other agents as tools.
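Because an agent used as a tool can itself call further agents, the recorded calls form a tree rather than a flat list. One possible way to test that, sketched here with hypothetical names (`CallNode`, `call_paths`, and the agent and tool names are illustrative assumptions), is to flatten the tree into paths from the root agent down to each call and assert on those paths.

```python
from dataclasses import dataclass, field


@dataclass
class CallNode:
    """A tool call; if the tool is itself an agent, its own calls are children."""
    name: str
    children: list["CallNode"] = field(default_factory=list)


def call_paths(node, prefix=()):
    """Yield the path of names from the root agent down to every call."""
    path = prefix + (node.name,)
    yield path
    for child in node.children:
        yield from call_paths(child, path)


# Hypothetical trace: an agent calls another agent as a tool, which calls a tool.
trace = CallNode("content_agent", [
    CallNode("taxonomy_agent", [CallNode("create_term")]),
    CallNode("list_node_types"),
])
paths = list(call_paths(trace))
```

A test could then require that a path such as `("content_agent", "taxonomy_agent", "create_term")` appears in `paths`, which checks both the tool and which agent called it.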

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

📌 Task

Status

Active

Version

1.1

Component

Code

Created by

🇩🇪Germany marcus_johansson


Merge Requests

!128 Create visible testing framework
Merged
🇩🇪Germany marcus_johansson
updated 2 months ago

Comments & Activities

Issue created by @marcus_johansson
Merge request !128 Resolve #3524725 "Create visible testing" (Merged) created by marcus_johansson
Comment 3 months ago →
🇩🇪Germany marcus_johansson
Comment 2 months ago →
🇩🇪Germany marcus_johansson
Comment 2 months ago →
🇩🇪Germany marcus_johansson
The testing framework is available at https://www.drupal.org/project/ai_agents_test. This issue will be used to rectify all changes made to the agents and the tools to improve the Drupal CMS experience.
Comment 2 months ago →
System Message

marcus_johansson → committed 3b3313b1 on 1.1.x
Resolve #3524725 "Create visible testing"
Comment 2 months ago →
🇩🇪Germany marcus_johansson
Comment about 2 months ago →
System Message
Automatically closed - issue fixed for 2 weeks with no activity.
Status changed to Fixed 6 days ago, 9:02pm 29 July 2025
Comment 6 days ago →
🇺🇸United States Kristen Pol Santa Cruz, CA, USA
unassigning
