Non-UTF8 encoding results in the exception "Failed to insert into collection"

Created on 21 October 2024, 2 months ago

With ai_search configured as shown in the attached screenshot, I clicked the "Index now" button and it failed. The recent log messages contain:

Failed to determine non-UTF8 encoding to attempt to auto-convert chunk: # 君强 君强 �组织
This is followed by a second error:
Exception: Failed to insert into collection: can only accept json format request, the request body should be nil, however {} is valid in Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider->insertIntoCollection() (line 402 of /var/www/html/drupal/web/modules/contrib/ai/modules/vdb_providers/vdb_provider_milvus/src/Plugin/VdbProvider/MilvusProvider.php)

The � character may be the culprit. I read the code in modules/ai_search/src/Plugin/EmbeddingStrategy/EmbeddingBase.php, and it contains two continue statements, so encoding failures should be skipped. Why does one bad chunk cause the whole indexing run to fail?
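
To make the expected behaviour concrete, here is a minimal sketch of "skip the bad chunk, insert the rest". It is written in Python purely as an illustration and is not the module's PHP code; the function names, the collection name, and the JSON field names are all hypothetical.

```python
import json


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def split_chunks(raw_chunks):
    """Separate decodable chunks from the ones that should be skipped."""
    clean, skipped = [], []
    for index, raw in enumerate(raw_chunks):
        if not is_valid_utf8(raw):
            skipped.append(index)  # remember which chunk was dropped
            continue               # skip it instead of failing the whole run
        clean.append(raw.decode("utf-8"))
    return clean, skipped


def build_insert_body(collection, chunks):
    """Build a JSON request body, or None when nothing is left to insert.

    The field names here are illustrative assumptions only.
    """
    if not chunks:
        return None
    payload = {
        "collectionName": collection,
        "data": [{"content": chunk} for chunk in chunks],
    }
    return json.dumps(payload, ensure_ascii=False)


# One valid chunk and one with a stray byte that can never occur in UTF-8.
good = "# 君强".encode("utf-8")
bad = "君强 ".encode("utf-8") + b"\xff" + "组织".encode("utf-8")

clean, skipped = split_chunks([good, bad])
body = build_insert_body("demo_collection", clean)
```

In this sketch the invalid chunk never reaches the request body, and an empty result produces no request at all. Whether the module behaves differently after its continue statements is exactly the open question in this report.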

🐛 Bug report
Status: Active
Version: 1.0
Component: AI Search
Created by: fishfree (🇨🇳 China)
