Form the data structure for AI events with main and optional fields

Created on 23 July 2025

Problem/Motivation

We have many different AI providers and vector database types, each with different interfaces. But to implement observability and usage reporting, we should form a general data structure for requests and responses and rely on it when aggregating the usage logs and building reports.

Proposed resolution

We should decide which data is crucial to pass in the events, store in logs and traces, and represent in reports, and which fields are required, optional, or just nice to have.

From that, we should form a strict data structure that will be represented in AI events, stored in logs, traces, and metrics, and visualized on the AI usage reporting pages for end users.

We should also provide mappings from the input and output data formats of common AI providers to our data structure.

So, we should form a structured list of the required and optional fields that should be present in AI events.

Proposed structure (keep it updated; a sketch of it as a value object follows the list):

  • providerId - string, required: the ID of the AI provider.
  • operationType - string, required: the name of the operation. Perhaps we should maintain a vocabulary for these values?
  • modelId - string, required: the name of the model.
  • tokensInput - integer, optional: number of tokens in the input data.
  • tokensOutput - integer, optional: number of tokens in the output data.
  • tokensTotal - integer, optional: number of total tokens used for the operation.
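
A minimal sketch of how this structure could look as a value object, using only the field names from the list above (the class name and namespace are hypothetical, not an existing module API):

    <?php

    declare(strict_types=1);

    namespace Drupal\ai_observability\Event;

    /**
     * Hypothetical value object for the proposed AI event data structure.
     *
     * Required fields must be passed to the constructor; token counts stay
     * optional because not every provider reports them.
     */
    final class AiEventData {

      public function __construct(
        public readonly string $providerId,
        public readonly string $operationType,
        public readonly string $modelId,
        public readonly ?int $tokensInput = NULL,
        public readonly ?int $tokensOutput = NULL,
        public readonly ?int $tokensTotal = NULL,
      ) {}

      /**
       * Flattens the structure for logs, traces, and metrics backends.
       */
      public function toArray(): array {
        return [
          'providerId' => $this->providerId,
          'operationType' => $this->operationType,
          'modelId' => $this->modelId,
          'tokensInput' => $this->tokensInput,
          'tokensOutput' => $this->tokensOutput,
          'tokensTotal' => $this->tokensTotal,
        ];
      }

    }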

Remaining tasks

  • Form the final version of the structure.
  • Represent this structure in all AI events, maybe introduce an interface and a base class (see the sketch after this list).
  • Create converters from the most popular AI providers' requests and responses to this data format.
  • Document that each provider should provide a converter to form the structured data from their events.
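
To make the interface and converter tasks above concrete, here is a hedged sketch (all class and method names are hypothetical, and it reuses the AiEventData value object sketched earlier; the OpenAI mapping follows the usage keys visible in the log example in the comments below):

    <?php

    /**
     * Hypothetical contract: each provider ships a converter that maps its
     * raw response data onto the shared event data structure.
     */
    interface AiEventDataConverterInterface {

      public function convert(string $providerId, string $modelId, array $rawOutput): AiEventData;

    }

    /**
     * Hypothetical OpenAI implementation: chat completions report usage as
     * prompt/completion/total token counts.
     */
    final class OpenAiEventDataConverter implements AiEventDataConverterInterface {

      public function convert(string $providerId, string $modelId, array $rawOutput): AiEventData {
        $usage = $rawOutput['usage'] ?? [];
        return new AiEventData(
          providerId: $providerId,
          operationType: 'chat',
          modelId: $rawOutput['model'] ?? $modelId,
          tokensInput: $usage['prompt_tokens'] ?? NULL,
          tokensOutput: $usage['completion_tokens'] ?? NULL,
          tokensTotal: $usage['total_tokens'] ?? NULL,
        );
      }

    }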

📌 Task

Status: Active
Version: 1.2
Component: AI Logging
Created by: murz (Yerevan, Armenia)


Comments & Activities

  • Issue created by @murz
  • Comment by @marcus_johansson

    Just so you don't miss it - all of that should already be abstracted in 1.2.x if you use the input and output interfaces plus the provider data from the events. For that specific part, at least for chat, there should be no need to do any extra work. All providers have the possibility to report how many tokens they use.

    This could also include:
    * Reasoning token usage
    * Cached token usage

    See: https://git.drupalcode.org/project/ai/-/blob/1.2.x/src/OperationType/Cha...

    If there are other mappings that are not provider specific, we should add those.

    For the operation types, they are available via:

    \Drupal::service('ai.provider')->getOperationTypes();
    

    You should be able to get readable labels as well, if it needs to be presented nicely instead of the machine name.
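
    For example, a minimal sketch (assuming the call above returns an iterable of operation type definitions; the exact return shape and array keys are assumptions here, adjust them to the real API):

    $operation_types = \Drupal::service('ai.provider')->getOperationTypes();
    // Assumption for illustration: each definition exposes an ID and a label.
    $options = [];
    foreach ($operation_types as $operation_type) {
      $options[$operation_type['id']] = $operation_type['label'];
    }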

  • Comment by @murz

    So, here is an example of what we have now in a log entry - the full data from the AI provider completions (chat) response event:

    tags:
      - chat
      - chat_generation
      - ai_api_explorer
    input:
      messages:
        -
          role: user
          text: hi
          tools: null
          images: {  }
          tool_id: null
      chat_tools: null
      debug_data: {  }
      chat_strict_schema: false
      chat_structured_json_schema: {  }
    model: gpt-4o
    output:
      metadata: {  }
      rawOutput:
        id: chatcmpl-BxHUfVPZeSb5qK0MgR2aFWDHjiou1
        model: gpt-4o-2024-08-06
        usage:
          total_tokens: 17
          prompt_tokens: 8
          completion_tokens: 9
          prompt_tokens_details:
            audio_tokens: 0
            cached_tokens: 0
          completion_tokens_details:
            audio_tokens: 0
            reasoning_tokens: 0
            accepted_prediction_tokens: 0
            rejected_prediction_tokens: 0
        object: chat.completion
        choices:
          -
            index: 0
            message:
              role: assistant
              content: 'Hello! How can I assist you today?'
            logprobs: null
            finish_reason: stop
        created: 1753468297
        system_fingerprint: fp_a288987b44
      normalized:
        role: assistant
        text: 'Hello! How can I assist you today?'
        tools: null
        images: {  }
        tool_id: null
      tokenUsage:
        input: null
        total: 2202
        cached: null
        output: null
        reasoning: null
    provider: openai
    debugData: {  }
    eventName: ai.post_generate_response
    configuration:
      top_p: 1
      max_tokens: 4096
      temperature: 1
      presence_penalty: 0
      frequency_penalty: 0
    operationType: chat
    providerRequestId: 4b2ace6f-eed3-4b8d-bf4b-5fd0aeae092a
    providerRequestParentId: null
    

    So, do we want to store everything from there in the logs? Or just specific fields?

    I think we can prepare pre-defined bundles of data to store in the log, something like (a configuration sketch follows the list):
    - Minimal (only the provider name, model name, request ID, and total token usage)
    - Average (+ detailed token usage and some of the most meaningful fields, maybe a summary of the input and output: number of messages, text length, etc.)
    - Detailed (+ input and output texts, with the ability to trim the text)
    - Full (everything excluding debug data)
    - Debug (everything + debug data)
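
    A rough sketch of how those bundles could be expressed as cumulative field allow-lists (all names here are hypothetical, for illustration only; the dotted paths point into the log entry structure shown above):

    <?php

    // Hypothetical: each level adds fields on top of the previous one.
    const AI_LOG_BUNDLES = [
      'minimal' => ['provider', 'model', 'providerRequestId', 'output.tokenUsage.total'],
      'average' => ['output.tokenUsage', 'operationType', 'input.summary', 'output.summary'],
      'detailed' => ['input', 'output.normalized'],
      'full' => ['tags', 'configuration', 'output.rawOutput', 'eventName'],
      'debug' => ['debugData'],
    ];

    /**
     * Resolves the cumulative field list for a given bundle level.
     */
    function ai_log_bundle_fields(string $level): array {
      $fields = [];
      foreach (AI_LOG_BUNDLES as $name => $bundle_fields) {
        $fields = array_merge($fields, $bundle_fields);
        if ($name === $level) {
          break;
        }
      }
      return $fields;
    }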
