- Issue created by @fishfree
- 🇬🇧 United Kingdom scott_euser
Here are the steps:
- Create the Search API Server & Index
- Ensure you have added at least one embedding field type in the Index -> Fields
- Index some content
- Create a new View at /admin/structure/views/add
- Under show > change from Content to 'Index {Your index name}'
- Add the Fulltext search field and expose it
- Use the exposed filters; you should get results
- 🇨🇳 China fishfree
@scott_euser Thank you! I tried that before. When I change a field to the embeddings field type, indexing fails with the error log below:
Drupal\ai\Exception\AiBadRequestException: Error invoking client: Client error: `POST http://10.2.91.48:11434/api/embed` resulted in a `400 Bad Request` response: {"error":"missing request body"} in Drupal\ai\Plugin\ProviderProxy->wrapperCall() (line 155 of /var/www/html/drupal/web/modules/contrib/ai/src/Plugin/ProviderProxy.php).
- 🇬🇧 United Kingdom scott_euser
I can't see your full stack trace, so I have no idea which AI provider or VDB provider you are using, but I've hit that error only when no embedding field type exists (which is logical: no embedding field = no vectors generated = no vectors to send to the vector database). E.g. see https://www.drupal.org/project/ai/issues/3477393 "Search API within AI Search should validate that there is at least one field configured as an Embedding" (status: Active), which is on my to-do list.
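The validation mentioned above could look roughly like the following. This is only a sketch and not the module's code: the helper name `hasEmbeddingField()`, the flat `field => type` array, and the `embeddings` type ID are all assumptions made for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Returns TRUE if at least one field uses the embeddings type.
 *
 * Hypothetical helper: $fieldTypes maps a field identifier to its type ID,
 * and the 'embeddings' type ID is an assumption for illustration.
 */
function hasEmbeddingField(array $fieldTypes, string $embeddingType = 'embeddings'): bool {
  return in_array($embeddingType, $fieldTypes, true);
}

$stringOnly = ['title' => 'string', 'body' => 'text'];
$withEmbedding = ['title' => 'string', 'body' => 'embeddings'];

var_dump(hasEmbeddingField($stringOnly));
var_dump(hasEmbeddingField($withEmbedding));
```

A check like this before indexing would let the backend report "no embedding field configured" directly instead of failing later with an empty request.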
- 🇬🇧 United Kingdom scott_euser
I think if you start by providing more details on your setup, I can see whether I have the same AI provider and VDB provider available (I'm using Ollama and OpenAI fine with Milvus/Zilliz and Pinecone; the latter is still an MR in another issue).
- 🇬🇧 United Kingdom scott_euser
Note that 1.0.x has a significant refactor. Please test with the latest code and re-open with more details if you still have an issue; otherwise please mark as closed (outdated). Thanks!
- 🇨🇳 China fishfree
@scott My detailed log had been flushed. I pulled the latest 1.0-dev, changed all the index fields from string to embeddings, and re-indexed. A new error occurred:
TypeError: Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider::getVdbIds(): Argument #1 ($collection_name) must be of type string, null given, called in /var/www/html/drupal/web/modules/contrib/ai/src/Base/AiVdbProviderClientBase.php on line 323 in Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider->getVdbIds() (line 473 of /var/www/html/drupal/web/modules/contrib/ai/modules/vdb_providers/vdb_provider_milvus/src/Plugin/VdbProvider/MilvusProvider.php).
#0 /var/www/html/drupal/web/modules/contrib/ai/src/Base/AiVdbProviderClientBase.php(323): Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider->getVdbIds()
#1 /var/www/html/drupal/web/modules/contrib/ai/src/Base/AiVdbProviderClientBase.php(316): Drupal\ai\Base\AiVdbProviderClientBase->deleteItems()
#2 /var/www/html/drupal/web/modules/contrib/ai/src/Base/AiVdbProviderClientBase.php(246): Drupal\ai\Base\AiVdbProviderClientBase->deleteIndexItems()
#3 /var/www/html/drupal/web/modules/contrib/ai/modules/ai_search/src/Plugin/search_api/backend/SearchApiAiSearchBackend.php(305): Drupal\ai\Base\AiVdbProviderClientBase->indexItems()
#4 /var/www/html/drupal/web/modules/contrib/search_api/src/Entity/Server.php(350): Drupal\ai_search\Plugin\search_api\backend\SearchApiAiSearchBackend->indexItems()
#5 /var/www/html/drupal/web/modules/contrib/search_api/src/Entity/Index.php(1008): Drupal\search_api\Entity\Server->indexItems()
#6 /var/www/html/drupal/web/modules/contrib/search_api/src/Entity/Index.php(937): Drupal\search_api\Entity\Index->indexSpecificItems()
#7 /var/www/html/drupal/web/modules/contrib/search_api/src/IndexBatchHelper.php(160): Drupal\search_api\Entity\Index->indexItems()
#8 /var/www/html/drupal/web/core/includes/batch.inc(296): Drupal\search_api\IndexBatchHelper::process()
#9 /var/www/html/drupal/web/core/includes/batch.inc(138): _batch_process()
#10 /var/www/html/drupal/web/core/includes/batch.inc(94): _batch_do()
#11 /var/www/html/drupal/web/core/modules/system/src/Controller/BatchController.php(52): _batch_page()
#12 [internal function]: Drupal\system\Controller\BatchController->batchPage()
#13 /var/www/html/drupal/web/core/lib/Drupal/Core/EventSubscriber/EarlyRenderingControllerWrapperSubscriber.php(123): call_user_func_array()
#14 /var/www/html/drupal/web/core/lib/Drupal/Core/Render/Renderer.php(638): Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->Drupal\Core\EventSubscriber\{closure}()
#15 /var/www/html/drupal/web/core/lib/Drupal/Core/EventSubscriber/EarlyRenderingControllerWrapperSubscriber.php(121): Drupal\Core\Render\Renderer->executeInRenderContext()
#16 /var/www/html/drupal/web/core/lib/Drupal/Core/EventSubscriber/EarlyRenderingControllerWrapperSubscriber.php(97): Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->wrapControllerExecutionInRenderContext()
#17 /var/www/html/drupal/vendor/symfony/http-kernel/HttpKernel.php(181): Drupal\Core\EventSubscriber\EarlyRenderingControllerWrapperSubscriber->Drupal\Core\EventSubscriber\{closure}()
#18 /var/www/html/drupal/vendor/symfony/http-kernel/HttpKernel.php(76): Symfony\Component\HttpKernel\HttpKernel->handleRaw()
#19 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/Session.php(53): Symfony\Component\HttpKernel\HttpKernel->handle()
#20 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/KernelPreHandle.php(48): Drupal\Core\StackMiddleware\Session->handle()
#21 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/ContentLength.php(28): Drupal\Core\StackMiddleware\KernelPreHandle->handle()
#22 /var/www/html/drupal/web/core/modules/big_pipe/src/StackMiddleware/ContentLength.php(32): Drupal\Core\StackMiddleware\ContentLength->handle()
#23 /var/www/html/drupal/web/core/modules/page_cache/src/StackMiddleware/PageCache.php(106): Drupal\big_pipe\StackMiddleware\ContentLength->handle()
#24 /var/www/html/drupal/web/core/modules/page_cache/src/StackMiddleware/PageCache.php(85): Drupal\page_cache\StackMiddleware\PageCache->pass()
#25 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/ReverseProxyMiddleware.php(48): Drupal\page_cache\StackMiddleware\PageCache->handle()
#26 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/NegotiationMiddleware.php(51): Drupal\Core\StackMiddleware\ReverseProxyMiddleware->handle()
#27 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/AjaxPageState.php(36): Drupal\Core\StackMiddleware\NegotiationMiddleware->handle()
#28 /var/www/html/drupal/web/core/lib/Drupal/Core/StackMiddleware/StackedHttpKernel.php(51): Drupal\Core\StackMiddleware\AjaxPageState->handle()
#29 /var/www/html/drupal/web/core/lib/Drupal/Core/DrupalKernel.php(741): Drupal\Core\StackMiddleware\StackedHttpKernel->handle()
#30 /var/www/html/drupal/web/index.php(19): Drupal\Core\DrupalKernel->handle()
#31 {main}
I should mention that I modified a few lines of the code to make it compatible with the new Ollama embeddings API:
modules/providers/provider_ollama/src/OllamaControlApi.php, as follows:
    - $result = json_decode($this->makeRequest("api/embeddings", [], 'POST', [
    + $result = json_decode($this->makeRequest("api/embed", [], 'POST', [
    - 'prompt' => $text,
    + 'input' => $text,
      'model' => $model,
modules/providers/provider_ollama/src/Plugin/AiProvider/OllamaProvider.php:
    - return new EmbeddingsOutput($response['embedding'], $response, []);
    + return new EmbeddingsOutput($response['embeddings'][0], $response, []);
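To make the difference between the two endpoints concrete, here is a small standalone sketch. It does no HTTP at all; it only shows the two request and response shapes implied by the change above (`prompt` / `embedding` for `api/embeddings`, `input` / `embeddings[0]` for `api/embed`). The function names and the `example-model` name are hypothetical, not part of the Ollama provider module.

```php
<?php

declare(strict_types=1);

/**
 * Builds the request body for an embeddings call.
 *
 * Assumes "api/embed" expects the text under "input" and the older
 * "api/embeddings" expects it under "prompt", as in the change above.
 */
function buildEmbedBody(string $endpoint, string $model, string $text): array {
  $key = $endpoint === 'api/embed' ? 'input' : 'prompt';
  return ['model' => $model, $key => $text];
}

/**
 * Extracts a single vector from either response shape.
 */
function extractVector(array $response): array {
  // Newer shape: a list of vectors under "embeddings".
  if (isset($response['embeddings'][0]) && is_array($response['embeddings'][0])) {
    return $response['embeddings'][0];
  }
  // Older shape: one vector under "embedding".
  if (isset($response['embedding']) && is_array($response['embedding'])) {
    return $response['embedding'];
  }
  throw new UnexpectedValueException('No embedding found in response.');
}

// Hard-coded sample responses, for illustration only.
$old = ['embedding' => [0.1, 0.2, 0.3]];
$new = ['embeddings' => [[0.1, 0.2, 0.3]]];

print_r(buildEmbedBody('api/embed', 'example-model', 'hello'));
print_r(extractVector($old));
print_r(extractVector($new));
```

Accepting both shapes in one place means the provider would keep working whichever endpoint the installed Ollama version serves.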
- 🇬🇧 United Kingdom scott_euser
Progress then, that's good!
It seems your 'collection' is empty in the configuration at https://git.drupalcode.org/project/ai/-/blob/1.0.x/src/Base/AiVdbProvide... - can you edit your Search API Server and make sure everything is filled in there please? See screenshot:
- 🇬🇧 United Kingdom scott_euser
Actually, that's probably solved if you pull the latest dev and run updb, as we just merged "Update hook for Search AI refactoring" (status: Active).
- 🇨🇳 China fishfree
@scott Thank you. However, after pulling the latest dev and running updb, the problem persists for me. I confirmed that I had set the collection before I posted this issue. My Ollama is the latest version.
- 🇬🇧 United Kingdom scott_euser
In EmbeddingBase.php, in the getRawEmbeddings() method, are you able to debug and see whether there is content in the chunk being sent? If so, the problem will be in the provider, in which case perhaps you can try a different provider to confirm.
If the chunks are empty, then perhaps you can test by changing `return $chunks;` to `return array_filter($chunks);` at the end of the getChunks() method; in that case there is perhaps something in your content that https://github.com/yethee/tiktoken-php isn't liking...
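For anyone trying the `array_filter($chunks)` test, note how PHP's `array_filter()` behaves without a callback: it removes every falsy value (so the string `'0'` goes too, while a whitespace-only chunk stays) and it preserves the original keys. The sketch below uses made-up sample chunks and also shows a stricter variant that drops only chunks that are empty after trimming.

```php
<?php

declare(strict_types=1);

// Made-up sample data for illustration.
$chunks = ['First chunk.', '', '0', "  \n", 'Last chunk.'];

// Without a callback, array_filter() removes every falsy value, which
// includes '' but also the string '0'. Keys are preserved.
$loose = array_filter($chunks);

// Stricter variant: drop only chunks that are empty after trimming,
// then reindex so the keys are sequential again.
$strict = array_values(array_filter(
  $chunks,
  static fn (string $chunk): bool => trim($chunk) !== ''
));

print_r($loose);
print_r($strict);
```

If any later code relies on sequential chunk indexes, the `array_values()` reindexing step matters; the plain `array_filter($chunks)` call leaves gaps in the keys.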
- 🇩🇪 Germany marcus_johansson
Could you also try the following to make sure that everything works as it should?
1. Enable the AI API Explorer
2. First test that embeddings works by going to /admin/config/ai/explorers/ai-embeddings and using the Embeddings model of your choice and seeing you get an array of numbers back.
3. If that works, test the vector DB explorer at /admin/config/ai/explorers/vector-db-search using your index; you should get back chunks.
- 🇨🇳 China fishfree
@marcus @scott
Thank you!
In fact, I had tried both explorers before posting here, and they worked. I just tested them again, and they still work.
With the configuration below (the same one I described in #4, before your refactoring was applied), indexing works.
If I set the index options as below (selecting the embeddings field type, as I did before your refactoring), indexing fails with the error log from #11.
- 🇨🇳 China fishfree
@Scott @Marcus Would you please help me? I still have made no progress.
- 🇬🇧 United Kingdom scott_euser
Could you try the things I suggested in #12? I.e.:
- debug to confirm there is content getting sent at that method (the error message says the body is missing, which could mean no content is getting sent)
- or, my second suggestion, there is content getting sent but the provider is the problem: none of the providers I have set up (just 2) have that issue
- 🇨🇳 China fishfree
@scott Thank you! I just installed the latest dev version of the ai module and set the index options. I seem to have made progress. However, new errors occurred:
Failed to determine non-UTF8 encoding to attempt to auto-convert chunk: # εεΌΊ εεΌΊ οΏ½η»η»
Hence the other error:
Exception: Failed to insert into collection: can only accept json format request, the request body should be nil, however {} is valid in Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider->insertIntoCollection() (line 402 of /var/www/html/drupal/web/modules/contrib/ai/modules/vdb_providers/vdb_provider_milvus/src/Plugin/VdbProvider/MilvusProvider.php)
I think even there are some failures from UTF8 converting or encoding, we should just ignore that chunck and continue. It seems the current code acts this way, but why still the whole indexing progress failed? Because when I only index one node, it succeeded, the node should have no garbled character like οΏ½.
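The "ignore that chunk and continue" idea can be sketched as below. This is not the AI module's code: `partitionUtf8Chunks()` is a hypothetical helper, the sample chunks are made up, and the sketch assumes PHP's mbstring extension is available for `mb_check_encoding()`.

```php
<?php

declare(strict_types=1);

/**
 * Splits chunks into valid UTF-8 ones and the keys of skipped ones.
 *
 * Hypothetical helper, not part of the AI module.
 */
function partitionUtf8Chunks(array $chunks): array {
  $valid = [];
  $skipped = [];
  foreach ($chunks as $key => $chunk) {
    if (mb_check_encoding($chunk, 'UTF-8')) {
      $valid[$key] = $chunk;
    }
    else {
      // Remember which chunk was dropped so it can be logged.
      $skipped[] = $key;
    }
  }
  return [$valid, $skipped];
}

// The second chunk contains a truncated multi-byte sequence.
$chunks = ['plain text', "broken \xE5\x8A bytes", '中文 text'];
[$valid, $skipped] = partitionUtf8Chunks($chunks);

print_r($valid);
print_r($skipped);
```

Returning the skipped keys alongside the valid chunks keeps the failure visible, so a batch can carry on while the affected items are still reported.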
- 🇬🇧 United Kingdom scott_euser
Hi @fishfree,
I think we might be best moving that to a separate issue. I suppose it's possible that, even if the conversion to UTF-8 fails, a try/catch could be added around generating the embedding to attempt it anyway. It's likely that this will just shift the error to the embedding step, though, and the content itself will probably still need to be fixed to be UTF-8 compatible. Beyond using Drupal Core's UTF-8 detection and repair tools, I am not sure how far we could take it within the module.
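As general PHP background on why invalid UTF-8 tends to resurface later in the pipeline: `json_encode()` refuses malformed UTF-8 by default, so a request body built from such a chunk may end up empty. Whether that is what happens inside the Milvus provider is not confirmed in this thread; the payload shape below is made up purely for illustration.

```php
<?php

declare(strict_types=1);

// Made-up payload containing a truncated multi-byte sequence.
$payload = ['data' => [['content' => "broken \xE5\x8A bytes"]]];

// Default behaviour: json_encode() returns false on malformed UTF-8.
$body = json_encode($payload);
$isUtf8Error = json_last_error() === JSON_ERROR_UTF8;
var_dump($body, $isUtf8Error);

// With JSON_THROW_ON_ERROR the failure can be caught per item.
$caught = null;
try {
  json_encode($payload, JSON_THROW_ON_ERROR);
}
catch (JsonException $e) {
  $caught = $e->getMessage();
}
var_dump($caught !== null);

// JSON_INVALID_UTF8_SUBSTITUTE replaces bad bytes with U+FFFD instead.
$repaired = json_encode($payload, JSON_INVALID_UTF8_SUBSTITUTE);
var_dump(is_string($repaired));
```

Substituting invalid bytes lets the request go through, but the stored text then contains replacement characters, which is why fixing the source content is still the cleaner route.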
In any case, it sounds like this particular issue of creating an index and exposing it to SOLR is at least sorted on your end, and now you have issues with specific content items and UTF-8. So, assuming a new issue also makes sense to you, let's close this one off, as we have now diverged from the original issue.
Thanks,
Scott
- Status changed to Postponed: needs info
5:53pm 4 February 2025
- 🇩🇪 Germany marcus_johansson
Should this be closed as outdated? It's 3 months old.
- 🇬🇧 United Kingdom scott_euser
Yep, sounds good. A follow-up regarding UTF-8 could be raised if needed, but for now we have, from other issues, even test coverage with more exotic characters working fine, so I expect it's better now.