- 🇲🇾Malaysia jonloh
Tried the patch, but unfortunately it does not work well in a multilingual setup.
- 🇮🇳India Akhil Babu Chengannur
Thanks for the patch. I have created a new patch with a few changes.
- When records are split, the current patch removes the split value from the original record and adds it to the split records. Instead, the new patch adds the first split value to the original record and the subsequent values to the split records.
- It adds a new field 'parent_record' to all records to filter out all splits associated with a record. The original record will have 'self' as the value in this field, and split records will have 'node_id:language_code' as the value. This field is used to delete all splits associated with a record when a node is modified/deleted. It will also help distinguish between the original record and its splits if you are building the search UI using JS. The 'parent_record' field should be configured as a filter from the Algolia dashboard for this to work.
- Works with multilingual content.
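The splitting scheme described above can be sketched as follows. This is an illustrative Python translation, not the module's actual PHP code; the field names follow the comment (first chunk stays on the original record, later chunks become split records carrying a 'parent_record' pointer), while the `-split-` objectID suffix and the function signature are assumptions for the example.

```python
def split_record(record, field, max_len, node_id, lang):
    """Split record[field] into chunks of at most max_len characters.

    The original record keeps the first chunk; each remaining chunk is
    emitted as an extra record that points back to its parent.
    """
    value = record[field]
    chunks = [value[i:i + max_len] for i in range(0, len(value), max_len)]

    # The original record keeps the first chunk and marks itself as parent.
    record[field] = chunks[0]
    record["parent_record"] = "self"

    splits = []
    for n, chunk in enumerate(chunks[1:], start=1):
        split = dict(record)
        split[field] = chunk
        split["objectID"] = f"{record['objectID']}-split-{n}"
        # Split records reference the parent so they can all be found and
        # deleted together when the node is modified or removed.
        split["parent_record"] = f"{node_id}:{lang}"
        splits.append(split)
    return record, splits
```

Filtering on `parent_record != self` (or deduplicating on it client-side) is what keeps the splits from showing up as separate hits in the search UI.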
- 🇺🇸United States maskedjellybean Portland, OR
Thank you for carrying this forward! Sadly I no longer have an Algolia project to work with so I can't test the new patch out.
- Status changed to Postponed: needs info
8:42am 21 July 2024 - 🇮🇳India nikunjkotecha India, Gujarat, Rajkot
This is good. I am not convinced, though, that we should index huge objects in Algolia. Can we have a real use case to help understand the need for this?
- 🇺🇸United States maskedjellybean Portland, OR
The use case is if you want to index more than 10000 characters in one record. :-)
Algolia offers the ability to split records in order to get around their character count limitation, so it would be great if search_api_algolia leveraged this ability.
Site builders/developers may not even realize their records are being truncated. When a record is truncated, search does not cover the entire record because only part of it is indexed. This means worse search results with no indication of why.
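For context on the limit being discussed: Algolia rejects records whose serialized size exceeds a plan-dependent byte limit (10,000 bytes in the error shown later in this thread). A quick sketch of how to check whether a record would be truncated or rejected, assuming the size is measured on the UTF-8 JSON serialization:

```python
import json

# Plan-dependent; 10,000 bytes matches the error message in this thread.
ALGOLIA_RECORD_LIMIT = 10_000


def record_size(record):
    """Approximate the size Algolia measures: the UTF-8 byte length of
    the JSON-serialized record."""
    return len(json.dumps(record, ensure_ascii=False).encode("utf-8"))


def exceeds_limit(record, limit=ALGOLIA_RECORD_LIMIT):
    """True if the record would be too big to index as-is."""
    return record_size(record) > limit
```

A record failing this check is exactly the case where splitting (rather than silent truncation) preserves searchability of the full content.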
- 🇺🇸United States kevinb623
This patch is working wonderfully to properly index and discover lengthy pages on a content rich website we manage.
My only suggestion is to update ItemSplitter.php line 68 to use isset() to reduce PHP warnings about undefined and null array keys.
Very nice work!
- 🇬🇧United Kingdom reecemarsland
Our use case is indexing PDF files attached to content, where the PDF content needs to be searchable.
- 🇧🇪Belgium Den Tweed
Same as in #17, our use case is making attached documents searchable.
I've worked further on patch #11 and changed the following:
- Fixed the warning in getSplitsForItem(); the whole method could be reduced to a single ?? statement
- Removed the getDataTypeHelper() and setDataTypeHelper() overrides, as they are unchanged from the parent class
- Moved the code from processFieldValue() to process() and removed the string type check. As far as I understand, the 'String' data type is for shorter field values (e.g. title, URL, etc.) that should already be below the limit in most cases. It's 'Text' (aka Fulltext) that we need most here, IMO, but in general anything treated as string data. That is already covered by shouldProcess() (which includes an is_string() check), the condition for calling process(), which in turn calls processFieldValue(). Since process() is an empty method in the parent class, there's no need to override processFieldValue() at all.
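The getSplitsForItem() simplification mentioned in the first bullet can be illustrated like this. It's a Python translation of the idea (the module is PHP, where the null coalescing operator `??` returns the left operand if it is set and not null); the names and signature are assumptions for the example.

```python
def get_splits_for_item(splits, item_id):
    """Return the splits recorded for an item, or an empty list if none.

    Equivalent in spirit to the one-line PHP body
        return $this->splits[$itemId] ?? [];
    which avoids the "undefined array key" warning that the original
    multi-line implementation triggered.
    """
    return splits.get(item_id, [])
```

The point of the change is that a defaulting lookup both silences the PHP warning and removes the need for an explicit isset()/branching implementation.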
- Status changed to Needs review
10:57am 20 November 2024 - 🇧🇪Belgium dieterholvoet Brussels
I am not convinced though that we should index huge objects in Algolia, can we have some real use case to help understand the need for this?
We hit this limit regularly on projects, e.g. when indexing long text fields or paragraphs for search. This is a very valid use case.
- 🇧🇪Belgium dieterholvoet Brussels
I started an MR based on the latest patch. I'm sometimes still getting the following error, even with the patch applied:
Record at the position 46 objectID=entity:node/118:bg-split-processed_2-1 is too big size=15808/10000 bytes. Please have a look at https://www.algolia.com/doc/guides/sending-and-managing-data/prepare-you...
I'll do some debugging.
- 🇧🇪Belgium dieterholvoet Brussels
I can't figure out the problem. That project might have been using an outdated patch; I updated it and will wait and see if the issue happens again.