Add some kind of quota management system

Issue created by @aspilicious
Comment 5 months ago →
🇩🇪Germany marcus_johansson
The PostGenerateResponseEvent and PostStreamingResponseEvent should more or less cover this; see https://project.pages.drupalcode.org/ai/developers/events/#example-2-pos... and https://project.pages.drupalcode.org/ai/developers/events/#example-3-str.... That means this could even be an external module. Both events expose metadata, which is where token usage usually resides, when available.

There would however be some problems I can see right away with this:

API keys are intentionally not available there. You would also need to listen to PreGenerateResponseEvent and use requestThreadId to connect it with the PostGenerateResponseEvent. The PreGenerateResponseEvent has the authentication mechanisms available.

API keys are not something generic that exists on all providers. Also, the PreGenerateResponseEvent is made for any logic where you might want to "load balance" requests across different endpoints, API keys, etc., so a solution would need a narrow scope.

This also touches on metadata, which we have not normalized at the moment. For the 2.0.0 release, something we want to look into is normalizing common configuration like temperature and common metadata like input and output tokens.

But a limited solution that works for instance with OpenAI, Gemini, Anthropic, Fireworks and some of the "easy-to-setup" services would be possible to create for sure.
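
To make the correlation idea above concrete, here is a rough, framework-agnostic sketch in Python rather than actual Drupal code. It assumes a "pre" listener that records which provider and credential a request thread used, and a "post" listener that looks up the same thread ID and reads token counts from the response metadata. All names here (`UsageTracker`, `key_label`, `provider_a`, the metadata keys) are hypothetical and only illustrate that the metadata is not normalized across providers.

```python
from dataclasses import dataclass, field


@dataclass
class PendingRequest:
    """What the pre-request listener remembers about one call."""
    provider: str
    key_label: str  # hypothetical label for the credential, never the key itself


# Hypothetical per-provider names for token counts in response metadata.
# Real providers report usage differently; these keys are illustrative only.
TOKEN_KEYS = {
    "provider_a": ("prompt_tokens", "completion_tokens"),
    "provider_b": ("input_tokens", "output_tokens"),
}


@dataclass
class UsageTracker:
    pending: dict = field(default_factory=dict)  # thread id -> PendingRequest
    totals: dict = field(default_factory=dict)   # key label -> tokens used

    def on_pre_generate(self, thread_id, provider, key_label):
        # Remember who is making the request so the post listener can find it.
        self.pending[thread_id] = PendingRequest(provider, key_label)

    def on_post_generate(self, thread_id, metadata):
        # Match the response to its request via the shared thread id.
        request = self.pending.pop(thread_id, None)
        if request is None:
            return 0  # no matching pre event, nothing to attribute
        keys = TOKEN_KEYS.get(request.provider)
        if keys is None:
            return 0  # provider not supported by this narrow-scope tracker
        used = sum(int(metadata.get(k, 0) or 0) for k in keys)
        self.totals[request.key_label] = self.totals.get(request.key_label, 0) + used
        return used
```

Unsupported providers are simply skipped, which matches the narrow scope suggested above: only services whose usage metadata is known get counted.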
Comment 5 months ago →
🇧🇪Belgium aspilicious
@marcus thank you, this answer is really helpful. As we probably can't fix this for all providers, we will probably create a separate module.

How would you block the requests when the limits are reached?
Comment 5 months ago →
🇬🇧United Kingdom MrDaleSmith
I believe the easiest way would be to create a custom exception and throw it, allowing other code to react to the event if it needs to.
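
As a rough illustration of that pattern (a Python sketch, not Drupal code), the check could run before the request is sent and raise a dedicated exception type that callers can catch. `QuotaExceededError`, `check_quota`, `generate` and the `limits` mapping are all hypothetical names made up for this example.

```python
class QuotaExceededError(Exception):
    """Raised before a request is sent when the caller is over its limit."""

    def __init__(self, key_label, used, limit):
        super().__init__(f"Quota exceeded for {key_label}: {used}/{limit} tokens")
        self.key_label = key_label
        self.used = used
        self.limit = limit


def check_quota(key_label, used_tokens, limits):
    """Raise if usage has reached the configured limit; no limit means no check."""
    limit = limits.get(key_label)
    if limit is not None and used_tokens >= limit:
        raise QuotaExceededError(key_label, used_tokens, limit)


def generate(key_label, used_tokens, limits, send):
    """Run the quota check first; `send` is only called if the check passes."""
    check_quota(key_label, used_tokens, limits)
    return send()
```

Because the exception carries the label, usage and limit, calling code can catch it specifically and decide whether to show a message, log it, or fall back to something else.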
Comment 5 months ago →
🇩🇪Germany marcus_johansson
@aspilicious - as Paul writes, an exception will work to stop the call. You can see how it's done in the AI External Moderation module here: https://git.drupalcode.org/project/ai/-/blob/1.0.x/modules/ai_external_m...
Comment 3 months ago →
🇩🇪Germany marcus_johansson
The first step should be coming in 1.1.x, see ✨ Abstract token usage (Active).

I think the full implementation that actually stops something needs to go into an external module. But this would at least provide a framework for it.
