- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
My understanding is all fields are multi-value fields, even if there is just one value. There is no array data type.
- Status changed to Postponed: needs info
over 1 year ago 12:35am 21 March 2023
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
Postponing unless someone has more input on my previous comment.
- Status changed to Active
over 1 year ago 4:45am 27 March 2023
- 🇦🇺Australia tallytarik
I've just run into this while trying to use the Neural Search plugin with a custom knn_vector field and ingest pipeline. That field type only supports a single value, and throws an error when it's passed an array. This is what happens at the moment because the input text field (title in my case) is indexed as an array:
[error] failed to parse field [title_embedding] of type [knn_vector] in document with id 'entity:node/12345:en'. Preview of field's value: '{knn=[...]}'. Current token (START_OBJECT) not numeric, can not use numeric value accessors
I've hacked together a change to IndexParamBuilder::buildFieldValues() to return the value as a string instead of an array, and can confirm it now works. Something like the patch in the linked issue "Source Fields in Elasticsearch Index are arrays" (RTBC) might be the way to go: check the field cardinality for each field and, if it's 1, process and return the first (and only) value rather than an array. I'm pretty new to OpenSearch, so I'm not 100% across whether that change could have other impacts, though.
- achap 🇦🇺
Just wanted to say I had the exact same issue as tallytarik. For the most part, everything being an array did not affect anything, except when implementing a knn_vector field. It throws the same error for a multi-value field. I used the IndexParamsEvent to alter the field to be single value, and it works.
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
I'm happy to consider this in the next major release branch. I think it would be a BC break, and I'm not sure whether an upgrade path would be needed.
- Status changed to Needs work
5 months ago 3:55am 18 June 2024
- 🇦🇺Australia kim.pepper 🇦🇺Sydney, Australia
Spent some time looking at the patch and how this could be implemented here. The code that checks whether a field is a list is quite complex, and seems to indicate a lack of trust in the TypedData definition isList() method.
If we were to proceed, I would expect we would need pretty decent Kernel test coverage to ensure indexing and querying work as expected.
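For illustration only, the cardinality check suggested earlier in the thread could look roughly like the sketch below. It is plain PHP and deliberately independent of the module: collapseSingleValueFields() and the $cardinality map are hypothetical names, the convention that -1 means "unlimited" is an assumption, and nothing here reproduces IndexParamBuilder::buildFieldValues() or the linked patch.

```php
<?php

declare(strict_types=1);

/**
 * Collapses single-cardinality fields to a scalar before indexing.
 *
 * Hypothetical helper, not part of the module.
 *
 * @param array<string, array> $fieldValues
 *   Field ID => list of already-processed values.
 * @param array<string, int> $cardinality
 *   Field ID => cardinality. Assumption: -1 means unlimited, and a field
 *   missing from this map is treated as multi-value.
 *
 * @return array<string, mixed>
 *   Field ID => scalar (cardinality 1) or list (anything else).
 */
function collapseSingleValueFields(array $fieldValues, array $cardinality): array {
  $document = [];
  foreach ($fieldValues as $fieldId => $values) {
    $isSingle = ($cardinality[$fieldId] ?? -1) === 1;
    if (!$isSingle) {
      // Multi-value (or unknown) fields keep the current array behaviour.
      $document[$fieldId] = array_values($values);
      continue;
    }
    if ($values === []) {
      // Assumption: an empty single-value field is left out of the document.
      continue;
    }
    // Cardinality 1: send the first (and only) value instead of an array.
    $document[$fieldId] = reset($values);
  }
  return $document;
}

$document = collapseSingleValueFields(
  ['title' => ['Hello world'], 'tags' => ['a', 'b']],
  ['title' => 1, 'tags' => -1]
);

echo json_encode($document), PHP_EOL;
// Prints: {"title":"Hello world","tags":["a","b"]}
```

With a document shaped like this, a single-value title reaches the ingest pipeline as a string rather than a one-element array. Whether that would be a BC break for existing indexes is exactly the open question above, so any real change would still need the Kernel test coverage mentioned.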