- Issue created by @murz
- 🇩🇪Germany marcus_johansson
Just so you don't miss it - all of that should already be abstracted in 1.2.x if you use the input and output interfaces plus the provider data from the events. For that specific part, at least for chat, there should be no need to do any extra work. All providers have the possibility to report how many tokens they use.
Included could also be:
* Reasoning token usage
* Cached token usage

See: https://git.drupalcode.org/project/ai/-/blob/1.2.x/src/OperationType/Cha...
If there are other mappings that are not provider-specific, we should add those.
For the operation types, they are available via:
\Drupal::service('ai.provider')->getOperationTypes();
You should be able to get readable labels as well, in case it should be presented nicely instead of the raw data name.
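As a sketch of how that lookup could be used, e.g. to build a select list (this assumes the 1.2.x `ai.provider` plugin manager; `getOperationTypes()` is named above, but the shape of its return value - machine name plus a `label` key - is an assumption here):

```php
<?php

// Sketch only: assumes the Drupal AI 1.2.x provider plugin manager.
// The return shape of getOperationTypes() (entries carrying an 'id'
// and a 'label') is an assumption for illustration.
$operation_types = \Drupal::service('ai.provider')->getOperationTypes();

$options = [];
foreach ($operation_types as $operation_type) {
  // Fall back to the machine name if no readable label is available.
  $options[$operation_type['id']] = $operation_type['label'] ?? $operation_type['id'];
}
```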
- 🇦🇲Armenia murz Yerevan, Armenia
So, here is an example of what we have now in the log entry: the full data from the AI provider completions (chat) response event:
tags:
  - chat
  - chat_generation
  - ai_api_explorer
input:
  messages:
    - role: user
      text: hi
      tools: null
      images: {  }
      tool_id: null
  chat_tools: null
  debug_data: {  }
  chat_strict_schema: false
  chat_structured_json_schema: {  }
  model: gpt-4o
output:
  metadata: {  }
  rawOutput:
    id: chatcmpl-BxHUfVPZeSb5qK0MgR2aFWDHjiou1
    model: gpt-4o-2024-08-06
    usage:
      total_tokens: 17
      prompt_tokens: 8
      completion_tokens: 9
      prompt_tokens_details:
        audio_tokens: 0
        cached_tokens: 0
      completion_tokens_details:
        audio_tokens: 0
        reasoning_tokens: 0
        accepted_prediction_tokens: 0
        rejected_prediction_tokens: 0
    object: chat.completion
    choices:
      - index: 0
        message:
          role: assistant
          content: 'Hello! How can I assist you today?'
        logprobs: null
        finish_reason: stop
    created: 1753468297
    system_fingerprint: fp_a288987b44
  normalized:
    role: assistant
    text: 'Hello! How can I assist you today?'
    tools: null
    images: {  }
    tool_id: null
  tokenUsage:
    input: null
    total: 2202
    cached: null
    output: null
    reasoning: null
provider: openai
debugData: {  }
eventName: ai.post_generate_response
configuration:
  top_p: 1
  max_tokens: 4096
  temperature: 1
  presence_penalty: 0
  frequency_penalty: 0
operationType: chat
providerRequestId: 4b2ace6f-eed3-4b8d-bf4b-5fd0aeae092a
providerRequestParentId: null
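For the token counts specifically, the normalized `tokenUsage` block is the provider-agnostic place to read from, with the raw provider `usage` block as a fallback. A rough sketch, using plain array access on a decoded log entry shaped like the dump above (key names are taken from that dump; the fallback logic itself is my suggestion, not existing code):

```php
<?php

// $entry is the decoded log entry, shaped like the dump above.
$usage = $entry['output']['tokenUsage'] ?? [];
$total = $usage['total'] ?? NULL;

// In the dump above the normalized input/output counts are NULL, so
// fall back to the raw OpenAI-style usage block when they are missing.
$raw = $entry['output']['rawOutput']['usage'] ?? [];
$input = $usage['input'] ?? $raw['prompt_tokens'] ?? NULL;
$output = $usage['output'] ?? $raw['completion_tokens'] ?? NULL;
```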
So, do we want to store everything from there in the logs? Or just specific fields?
I think we can prepare pre-defined bundles of data to store in log, something like:
- Minimal (only provider name, model name, request id, and total token usage)
- Average (+ detailed token usage and the most meaningful fields, maybe a summary of the input and output: number of messages, text length, etc.)
- Detailed (+ input and output texts, with the ability to trim the text)
- Full (everything excluding debug)
- Debug (everything + debug)
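A hypothetical way to express those bundles would be cumulative allow-lists of dot-paths into the event data above (all names here are made up for illustration, not an existing config schema):

```php
<?php

// Hypothetical mapping of log detail levels to stored fields; each
// level is meant to include everything from the previous one.
const AI_LOG_LEVEL_FIELDS = [
  'minimal' => ['provider', 'output.rawOutput.model', 'providerRequestId', 'output.tokenUsage.total'],
  'average' => ['output.tokenUsage', 'operationType', 'configuration'],
  'detailed' => ['input.messages', 'output.normalized'],
  'full' => ['*'],
  'debug' => ['*', 'debugData'],
];
```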