- Issue created by @fishfree
- 🇬🇧United Kingdom scott_euser
Possibly related 🐛 Broken Byte-Pair Encoding (BPE) Active
- 🇬🇧United Kingdom scott_euser
It would be good to know whether the issue still occurs now that we have merged 🐛 Broken Byte-Pair Encoding (BPE) (Active).
With ai_search configured as shown in the screenshot, I clicked the "Index now" button. Indexing failed with an error, and the recent log messages contain:
Failed to determine non-UTF8 encoding to attempt to auto-convert chunk: # 君强 君强 �组织
This presumably leads to the other error:
Exception: Failed to insert into collection: can only accept json format request, the request body should be nil, however {} is valid in Drupal\vdb_provider_milvus\Plugin\VdbProvider\MilvusProvider->insertIntoCollection() (line 402 of /var/www/html/drupal/web/modules/contrib/ai/modules/vdb_providers/vdb_provider_milvus/src/Plugin/VdbProvider/MilvusProvider.php)
Maybe the character � is the culprit. I read the code in modules/ai_search/src/Plugin/EmbeddingStrategy/EmbeddingBase.php and found two continue statements, so encoding failures should be skipped. Why did this cause the whole index to fail?
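To make the question concrete, here is a minimal Python sketch of the behaviour I would expect. It is an illustration only, not the module's PHP code: the helper names split_bytes and valid_utf8_chunks are hypothetical, and the byte-offset split is just one assumed way a chunk could end up with a broken multi-byte character. A chunk that is not valid UTF-8 is skipped, and only the remaining chunks are serialized to JSON.

```python
import json


def split_bytes(text: str, size: int) -> list[bytes]:
    """Naively split UTF-8 bytes into fixed-size chunks.

    Hypothetical helper: cutting at a byte offset can land in the middle
    of a multi-byte character and leave an invalid sequence behind.
    """
    data = text.encode("utf-8")
    return [data[i:i + size] for i in range(0, len(data), size)]


def valid_utf8_chunks(chunks: list[bytes]) -> list[str]:
    """Keep only chunks that decode as strict UTF-8 and skip the rest."""
    kept = []
    for chunk in chunks:
        try:
            kept.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            continue  # skip the broken chunk instead of passing it on
    return kept


# "# 君强 组织" is 15 bytes in UTF-8; 5-byte chunks cut through 组.
chunks = split_bytes("# 君强 组织", 5)
texts = valid_utf8_chunks(chunks)

# Only valid text reaches the JSON request body.
payload = json.dumps({"chunks": texts}, ensure_ascii=False)
```

If broken chunks are skipped like this, the request body stays valid JSON. The Milvus error above suggests that at least one invalid chunk still reached the insert request, but that is my guess rather than something I have confirmed in the code.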
Active
1.0
AI Search