Add a way to run LLM regression test

Created on 25 November 2024

Problem/Motivation

We can already test all the services/methods that the agents use with normal testing, but there is no way to see whether a change to the prompts causes regressions across a wide range of different prompt types.

We need some way to run a list of prompts, each with an expected result, against a real provider, so we can check whether a prompt change affected any type of prompt.
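As a sketch of what such a prompt/expected-result fixture could look like, here is a minimal loader. The CSV column names (`prompt`, `expected_json`) are hypothetical, not something defined by the module:

```python
import csv
import io
import json

# Hypothetical fixture: each row pairs a prompt with the JSON the
# agent is expected to produce. Column names are illustrative only.
FIXTURE_CSV = """prompt,expected_json
"Create a plain text field called Subtitle","{""type"": ""string"", ""label"": ""Subtitle""}"
"Create an integer field called Weight","{""type"": ""integer"", ""label"": ""Weight""}"
"""

def load_fixture(csv_text):
    """Parse the CSV into (prompt, expected_structure) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["prompt"], json.loads(row["expected_json"])) for row in reader]

cases = load_fixture(FIXTURE_CSV)
print(len(cases))           # number of regression cases
print(cases[0][1]["type"])  # expected field type of the first case
```

Each pair would then be sent to the real provider one at a time, with the response compared against the expected structure.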

Proposed resolution

Look at the AI Agents Form Integration module and its FieldTypeCreationForm: it can take a CSV and then run the prompts one after another.
Add a listener on the post event that checks the resulting JSON against the expected JSON.
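The comparison step in that listener could look roughly like the sketch below. The function name is illustrative; the real implementation would hook into the module's event system and receive the raw text the agent produced:

```python
import json

def json_matches(actual_text, expected):
    """Compare agent output against the expected structure.

    Parsing both sides before comparing makes the check insensitive
    to key order and whitespace, which would otherwise report false
    regressions on semantically identical output.
    """
    try:
        actual = json.loads(actual_text)
    except json.JSONDecodeError:
        return False  # non-JSON output always counts as a regression
    return actual == expected

# Key order and formatting differences do not count as regressions.
ok = json_matches('{"label": "Subtitle", "type": "string"}',
                  {"type": "string", "label": "Subtitle"})
bad = json_matches('{"type": "integer"}', {"type": "string"})
print(ok, bad)  # True False
```

Comparing parsed structures rather than raw strings is the key design choice here: providers rarely reproduce byte-identical output, so a string diff would flag nearly every run as a failure.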

Remaining tasks

User interface changes

API changes

Data model changes

Feature request
Status

Active

Version

1.0

Component

Code

Created by

🇩🇪Germany marcus_johansson


