- Issue created by @marcus_johansson
- Status changed to Needs review
5 months ago 6:03pm 14 June 2024 - 🇩🇰Denmark ressa Copenhagen
Thanks for adding this feature, it will be really useful to help understand what goes on behind the scenes, like which prompt and parameters are being sent (role, max. tokens, and so on).
This feature will probably help answer questions like the one I posed recently:
Bonus question: Does anyone have any experience with getting fairly fast and short answers (max. 500 characters) from HuggingFace, and then getting longer processing times, and thereby longer answers, after upgrading to a paid account?
HuggingFace answers after only 5 seconds with a short reply, whereas OpenAI, which I used previously, spent up to 30 seconds, returning very elaborate replies with up to 4000 character ...
From #3454202-4: Enable token support for AI Interpolator Rule Huggingface Text Generation.
- 🇬🇧United Kingdom yautja_cetanu
Probably should log:
- Response time
- Tokens
- moderation response
(Maybe tokens / minute)
You'll want to run tests and then model how long / expensive something will be when it scales up.
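A rough sketch of what such a log entry could look like, in Python purely for illustration (it is not the module's code). The names `CallLog` and `logged_call`, and the dictionary keys `prompt_tokens`, `completion_tokens` and `moderation_flagged`, are all hypothetical; a real provider would report these values in its own format.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CallLog:
    """One log entry per provider call (hypothetical structure)."""
    response_seconds: float
    prompt_tokens: int
    completion_tokens: int
    moderation_flagged: Optional[bool]  # None if no moderation check ran

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    @property
    def tokens_per_minute(self) -> float:
        # Throughput of this single call, scaled to one minute.
        if self.response_seconds <= 0:
            return 0.0
        return self.total_tokens / self.response_seconds * 60


def logged_call(call: Callable[[str], dict], prompt: str,
                clock: Callable[[], float] = time.perf_counter) -> CallLog:
    """Time a provider call and collect the values worth logging.

    `call` is assumed to return a dict with the hypothetical keys
    used below; missing keys fall back to 0 / None.
    """
    start = clock()
    result = call(prompt)
    elapsed = clock() - start
    return CallLog(
        response_seconds=elapsed,
        prompt_tokens=result.get("prompt_tokens", 0),
        completion_tokens=result.get("completion_tokens", 0),
        moderation_flagged=result.get("moderation_flagged"),
    )
```

With entries like this stored per request, the response time and tokens / minute from a handful of test runs can be multiplied up to estimate duration and cost at scale.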
- 🇩🇪Germany marcus_johansson
Highlighting this comment, as the response time still needs to be fixed.