Add a way to run LLM regression test

Created on 25 November 2024

Problem/Motivation

We can already test all the services/methods that the agents use with normal testing, but there is no way to see whether a change to the prompts causes regressions across a wide range of different prompt types.

We need some way to run a list of prompts, each with an expected result, against a real provider, so we can check whether a prompt change affected any type of prompt.
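As a sketch of what such a prompt/expected-result fixture could look like, here is a minimal loader. The CSV column names (`prompt`, `expected_json`) are hypothetical, not something defined by the module:

```python
import csv
import io
import json

# Hypothetical fixture: each row pairs a prompt with the JSON the
# agent is expected to produce. Column names are illustrative only.
FIXTURE_CSV = """prompt,expected_json
"Create a plain text field called Subtitle","{""type"": ""string"", ""label"": ""Subtitle""}"
"Create an integer field called Weight","{""type"": ""integer"", ""label"": ""Weight""}"
"""

def load_fixture(csv_text):
    """Parse the CSV into (prompt, expected_structure) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["prompt"], json.loads(row["expected_json"])) for row in reader]

cases = load_fixture(FIXTURE_CSV)
print(len(cases))           # number of regression cases
print(cases[0][1]["type"])  # expected field type of the first case
```

Each pair would then be sent to the real provider one at a time, with the response compared against the expected structure.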

Proposed resolution

Look at the AI Agents Form Integration module and its FieldTypeCreationForm: it can take a CSV and then run the prompts one after another.
Add a listener on the post event that checks the resulting JSON against the expected JSON.
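The comparison step in that listener could look roughly like the sketch below. The function name is illustrative; the real implementation would hook into the module's event system and receive the raw text the agent produced:

```python
import json

def json_matches(actual_text, expected):
    """Compare agent output against the expected structure.

    Parsing both sides before comparing makes the check insensitive
    to key order and whitespace, which would otherwise report false
    regressions on semantically identical output.
    """
    try:
        actual = json.loads(actual_text)
    except json.JSONDecodeError:
        return False  # non-JSON output always counts as a regression
    return actual == expected

# Key order and formatting differences do not count as regressions.
ok = json_matches('{"label": "Subtitle", "type": "string"}',
                  {"type": "string", "label": "Subtitle"})
bad = json_matches('{"type": "integer"}', {"type": "string"})
print(ok, bad)  # True False
```

Comparing parsed structures rather than raw strings is the key design choice here: providers rarely reproduce byte-identical output, so a string diff would flag nearly every run as a failure.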

Remaining tasks

User interface changes

API changes

Data model changes

Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇩🇪Germany marcus_johansson


