- Issue created by @artemboiko
- First commit to issue fork.
- Merge request !52Issue #3444888: LIKE and EXACT filter for searching fulltext → (Open) created by mparker17
- Open on Drupal.org →Core: 10.2.1 + Environment: PHP 8.1 & MySQL 5.7last update
7 months ago Waiting for branch to pass - Status changed to Needs review
7 months ago 4:35pm 2 May 2024 - Open on Drupal.org →Core: 10.2.1 + Environment: PHP 8.1 & MySQL 5.7last update
7 months ago Waiting for branch to pass - 🇨🇦Canada mparker17 UTC-4
@artemboiko, thank you! (and apologies: I only have expertise in the 8.0.x branch of the module, so I'm unable to help move the 8.x-7.x branch forward!)
I've created a merge request: lets see if tests pass.
Offhand, the code looks fine thus far. May I request that you update the test at
tests/src/Unit/SearchAPI/Query/FilterBuilderTest.php
to test the new code? (this will ensure that future changes don't accidentally break the LIKE and EXACT functionality you need!)Thank you very much!
- 🇨🇦Canada mparker17 UTC-4
If possible, it would be nice to have a NOT LIKE filter too, as put forward in #3092486-10: LIKE filter missing → . That issue (which is a lot older and uses the old, switch-based syntax) suggests the following code...
+ case 'LIKE': + $filter = [ + 'bool' => [ + 'should' => [ + 'wildcard' => [$condition->getField() => $condition->getValue()], + ], + ] + ]; + break; + + case 'NOT LIKE': + $filter = [ + 'bool' => [ + 'must_not' => [ + 'wildcard' => [$condition->getField() => $condition->getValue()], + ], + ] + ]; + break; +
... I'd like to close that other issue in favor of this one, so I'm pasting it here so that the code isn't forgotten.
- 🇨🇦Canada mparker17 UTC-4
Crediting @abrar_arshad because they proposed a patch in ✨ LIKE filter missing Closed: duplicate
- 🇨🇦Canada mparker17 UTC-4
Crediting @artemboiko because they were not credited automatically
Also crediting @sokru because they provided direction in ✨ LIKE filter missing Closed: duplicate
- 🇨🇦Canada mparker17 UTC-4
Crediting @kevinn because they proposed a patch in ✨ LIKE filter missing Closed: duplicate
- Assigned to mparker17
- 🇨🇦Canada mparker17 UTC-4
Assigning to myself to see if I can integrate the NOT LIKE filter, and add some tests.
- 🇨🇦Canada mparker17 UTC-4
That seemed to work!
Although that being said, our tests (currently) only test that the queries get constructed in the way we expect, i.e.: we don't (yet) test that Elasticsearch 8 can parse them or that they return results that make sense for that kind of query.
I noticed that both the
LIKE
andNOT LIKE
filters in ✨ LIKE filter missing Closed: duplicate both specifywildcard
queries, but theLIKE
filter from this issue does not. However, I copied theNOT LIKE
query construction from ✨ LIKE filter missing Closed: duplicate for now.We don't seem to use
wildcard
queries anywhere else in the 8.0.x version of this module. Our sibling project, Search API OpenSearch → doesn't either (or at least, not in their 3.x branch).I'm not exactly certain why we aren't using
wildcard
queries, but I daresay for now we should try to be internally consistent, i.e.: change theNOT LIKE
query I just added fromwildcard
to something else; but I'm not particularly familiar with raw Elasticsearch query construction, so I'm not certain if — for example — changingwildcard
toterm
would work without manual testing (which I will try shortly). - 🇨🇦Canada mparker17 UTC-4
Crediting @karma86 for the patch in ✨ LIKE and EXACT filter for searching fulltext Active (for this module's 8.x-7.x branch)
- 🇨🇦Canada mparker17 UTC-4
Should probably add ✨ LIKE filter missing Closed: duplicate as a related issue too.
- Issue was unassigned.
- 🇨🇦Canada mparker17 UTC-4
I grepped the code in Search API, and a number of modules providing search backends for it → to see if I could find any other examples of
LIKE
and/orNOT LIKE
filters... the only match was insearch_api
'ssearch_api_db
submodule (i.e.: an SQL-based search backend), which implementedLIKE
andNOT LIKE
by escaping any%
and_
wildcard characters in the search term, then performing an SQLLIKE '%term%'
query.That says to me that end-users shouldn't be able to enter raw wildcards, meaning that we shouldn't be creating a
wildcard
query in our implementation. (but maybe I misread, please correct me if I am wrong). From a security perspective, I can imagine that allowing end-users to enter wildcards directly means that malicious end-users could create wildcard-filled queries to slow down the back-end, potentially causing problems for other users. But, I don't know if Elasticsearch has some built-in guardrails to prevent that.It is also worth noting that all of our other filter term operators create
term
queries (which look for exact matches). However, @artemboiko's implementation forLIKE
in #2 ✨ LIKE and EXACT filter for searching fulltext Needs review creates amatch
query (which fuzzy-matches the search term instead of looking for an exact match). Fuzzy-matching is not the same thing as an SQLLIKE '%term%'
, but I daresay that @artemboiko's implementation feels like the right solution here... it seems to me thatLIKE
should be more lenient than=
; andNOT LIKE
seems as if it should be more lenient than<>
... and while we could implementLIKE
andNOT LIKE
exactly likesearch_api_db
does, it seems to me like one of the reasons why a someone might want to use ElasicSearch instead of SQL is because ElasticSearch has fuzzy-matching and SQL does not.However, if you have a good counter-point, I'd be interested to hear it!
Acting on the above thoughts, I have left the
LIKE
implementation the way that @artemboiko wrote it, and changed theNOT LIKE
implementation to avoid creating a wildcard query...diff --git a/src/SearchAPI/Query/FilterBuilder.php b/src/SearchAPI/Query/FilterBuilder.php index 96438b2..ed17010 100644 --- a/src/SearchAPI/Query/FilterBuilder.php +++ b/src/SearchAPI/Query/FilterBuilder.php @@ -190,7 +190,12 @@ class FilterBuilder { 'NOT LIKE' => [ 'bool' => [ 'must_not' => [ - 'wildcard' => [$condition->getField() => $condition->getValue()], + 'match' => [ + $condition->getField() => [ + 'query' => $condition->getValue(), + 'fuzziness' => 'AUTO', + ] + ], ], ], ],
I think this is ready for review now, but since I worked on the patch, I cannot RTBC it anymore. So I'll unassign myself, and leave this as "Needs review" for others to provide feedback.
Thank you in advance! (and sorry for the long comment)
- 🇫🇮Finland sokru
Looks good to me, only thing I was wondering that we hardcode the fuzziness value here to "auto" and someone might expect to get the setting from here: https://git.drupalcode.org/project/elasticsearch_connector/-/blob/8.0.x/...