- Issue created by @ressa
- Status changed to Needs review
6 months ago 2:43pm 12 June 2024 - 🇩🇰Denmark ressa Copenhagen
Bonus question: Does anyone have experience with getting fairly fast and short answers (max. 500 characters) from HuggingFace, and whether processing time, and thereby answer length, increases after upgrading to a paying account?
HuggingFace answers after only 5 seconds with a short reply, whereas OpenAI, which I used previously, spent up to 30 seconds and returned very elaborate replies of up to 4000 characters ...
- 🇩🇪Germany marcus_johansson
Thanks, I assume you have already tested this yourself, so I'll set it to fixed. It should be visible in the DEV version.
Regarding your question: do you mean the HuggingFace Pro account versus the normal HuggingFace account? If that is the case, I wouldn't know; I don't see any difference between them.
If you mean that you are actually paying for a machine via dedicated inference, then the speed depends on the machine size.
- Status changed to Fixed
5 months ago 12:57pm 15 June 2024 - 🇩🇰Denmark ressa Copenhagen
Thanks for the fast reply @Marcus_Johansson.
So I guess there is no way to turn up a time parameter or a max-characters limit ... I assumed that a free account gets, for example, 5 seconds of execution time and a 500-character cap, whereas paying customers could in theory set a maximum execution time and output length, for example 30 seconds and 5000 characters.
But since you write "I don't see any difference between them.", maybe it's not possible. It's just odd, because locally Ollama and Llama3 have no problem crunching a prompt for 15-25 seconds and returning ~5000 characters, but the same model on HuggingFace only thinks for ~5 seconds and returns ~500 characters.
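One thing worth checking: the HuggingFace serverless Inference API applies a fairly low default `max_new_tokens` for text-generation models, which would explain short replies regardless of account tier. A minimal sketch of raising that cap explicitly (the model name, token, and default value of 1500 here are placeholder assumptions, not something confirmed in this thread):

```python
def build_payload(prompt: str, max_new_tokens: int = 1500) -> dict:
    """Build a text-generation request body for the HF Inference API.

    Passing max_new_tokens explicitly may lift the short default output
    cap that makes serverless replies look truncated at ~500 characters.
    """
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,  # raise the output cap
            "return_full_text": False,         # omit the prompt from the reply
        },
    }


# To actually send the request (needs a valid token; not run here):
# import requests
# resp = requests.post(
#     "https://api-inference.huggingface.co/models/meta-llama/Meta-Llama-3-8B-Instruct",
#     headers={"Authorization": "Bearer hf_..."},  # placeholder token
#     json=build_payload("Explain Drupal hooks in detail."),
#     timeout=60,
# )
```

If the Drupal AI module exposes a configuration field for generation parameters, setting `max_new_tokens` there would be the equivalent fix without custom code.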
Automatically closed - issue fixed for 2 weeks with no activity.