Change how users select `tokenizer chat model` on AI Search / Search API server

Created on 5 September 2024

Problem/Motivation

The `Tokenizer chat model` select input on the AI Search server configuration page can show models that are not supported by the current implementation of the Drupal\ai\Utility\Tokenizer class.

Steps to reproduce

1. Enable an AI provider other than OpenAI, such as Ollama with models like llama3.1, qwen2, or gemma2.
2. Configure an AI Search server (with Milvus, which is currently the only option).
3. The `Tokenizer chat model` select input will show those models as options.
4. Configure an AI Search index.
5. Run indexing; you will get the following error:

InvalidArgumentException: Unknown model name: llama3.1:latest in Yethee\Tiktoken\EncoderProvider->getForModel() (line 123 of /var/www/html/vendor/yethee/tiktoken/src/EncoderProvider.php).
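
For context, the tokenizer library used under the hood only recognizes OpenAI model names. Below is a minimal standalone sketch (outside Drupal, not part of the module) of how the exception is triggered, based on the yethee/tiktoken EncoderProvider API referenced in the error above:

```php
<?php

// Illustrative reproduction: yethee/tiktoken maps only OpenAI model names
// to encodings, so a non-OpenAI model id such as "llama3.1:latest" cannot
// be resolved to an encoder.

require 'vendor/autoload.php';

use Yethee\Tiktoken\EncoderProvider;

$provider = new EncoderProvider();

// Works: a model name the provider knows about (resolves to cl100k_base).
$encoder = $provider->getForModel('gpt-4');
var_dump(count($encoder->encode('How long is this chunk of text?')));

// Fails: the model id passed through from the Ollama provider.
$encoder = $provider->getForModel('llama3.1:latest');
// => InvalidArgumentException: Unknown model name: llama3.1:latest
```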

Proposed resolution

A few suggestions:

  1. Instead of showing every enabled chat model (many of which may not be supported by the current Yethee\Tiktoken\EncoderProvider), show only the models it actually supports (see the sketch after this list).
  2. Also provide a simple character-splitting option alongside the specialized token-splitting one.
  3. Provide helper text that guides users of the module in choosing between the available options.
  4. Provide a sensible default value, maybe gpt-4 -> cl100k_base.
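
For suggestions 1, 2, and 4, a hypothetical configuration-form fragment could expose the tiktoken encodings rather than the list of enabled chat models, add a character-splitting fallback, and default to gpt-4 / cl100k_base. Field names, option labels, and the exact list of encodings below are illustrative only (they are not the module's current API, and the supported encodings depend on the installed yethee/tiktoken version):

```php
<?php

// Hypothetical form fragment for the AI Search server configuration.
// Offers a simple character-based fallback plus token-based splitting
// limited to encodings that Yethee\Tiktoken\EncoderProvider can resolve.

$form['chunking_strategy'] = [
  '#type' => 'select',
  '#title' => t('Chunking strategy'),
  '#options' => [
    'tokens' => t('Token-based splitting (tiktoken)'),
    'characters' => t('Simple character-based splitting'),
  ],
  '#default_value' => 'tokens',
  '#description' => t('Token-based splitting counts tokens with an OpenAI-compatible tokenizer; character-based splitting counts characters and works with any model.'),
];

$form['tokenizer_encoding'] = [
  '#type' => 'select',
  '#title' => t('Tokenizer encoding'),
  // Only encodings supported by yethee/tiktoken, not every enabled chat model.
  '#options' => [
    'cl100k_base' => t('cl100k_base (gpt-4, gpt-3.5-turbo)'),
    'p50k_base' => t('p50k_base (text-davinci-003)'),
    'r50k_base' => t('r50k_base (older GPT-3 models)'),
  ],
  '#default_value' => 'cl100k_base',
  '#description' => t('Used to estimate chunk sizes in tokens. If your model is not listed, cl100k_base is usually a reasonable approximation, or switch to character-based splitting.'),
  '#states' => [
    'visible' => [
      ':input[name="chunking_strategy"]' => ['value' => 'tokens'],
    ],
  ],
];
```

Exposing the encoding (rather than a chat model name) keeps the select independent of which AI providers happen to be enabled, so a site running only Ollama can still index content.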

Remaining tasks

User interface changes

API changes

Data model changes

📌 Task
Status

Active

Version

1.0

Component

AI Search

Created by

🇲🇽 Mexico jackbravo

