- Issue created by @brockfanning
- Status changed to Postponed: needs info
9 months ago 4:38pm 7 April 2024 - 🇦🇹Austria drunken monkey Vienna, Austria
It seems like you either have Solr set up incorrectly (if you customized the configuration compared to the one the Search API Solr module generates) or the keywords do not correctly reach Solr. Normally, if something is a stopword, it should be properly ignored both during indexing and during searching. If it is only ignored during indexing, then you’d see the issue you describe. Properly configured, though, the search terms "walk the dog" should be treated exactly like "walk dog", completely ignoring the “the”. (Ignoring a bit of magic to make phrase searches still work properly.)
Anyways, search views can also be configured to ignore keywords below a certain length: see the “Minimum keyword length” of the “Search: Fulltext search” filter. So maybe that’s the problem? Though this should also just discard those short keywords completely, not cause 0 results, so that could not be the entire explanation.
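To make the failure mode above concrete, here is a minimal sketch of what happens when stopwords are dropped only at index time. It is a toy model, not Solr: the stopword list, the documents, and the helper names `analyze` and `search` are all hypothetical, and every query term is assumed to be required.

```python
# Hypothetical stopword list and documents, for illustration only.
STOPWORDS = {"the", "and", "of"}


def analyze(text, remove_stopwords):
    """Lowercase, split on whitespace, and optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


def search(docs, keywords, query_removes_stopwords):
    """Return ids of documents that contain every analyzed query term.

    Documents are always indexed with stopwords removed; only the
    query-side behavior varies.
    """
    index = {doc_id: set(analyze(body, True)) for doc_id, body in docs.items()}
    terms = analyze(keywords, query_removes_stopwords)
    return sorted(
        doc_id for doc_id, tokens in index.items()
        if all(term in tokens for term in terms)
    )


docs = {1: "Walk the dog every morning", 2: "Feed the cat"}

# Stopwords ignored on both sides: "walk the dog" behaves like "walk dog".
consistent = search(docs, "walk the dog", query_removes_stopwords=True)
# Stopwords ignored only while indexing: "the" is required but never indexed.
index_only = search(docs, "walk the dog", query_removes_stopwords=False)

print(consistent)  # [1]
print(index_only)  # []
```

In the second call the required term "the" can never match, because it was never indexed, so the result set is empty.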
My first step would be to check the request that is sent to Solr. If the keywords appear there, then you are likely looking at a Solr configuration issue. If not, then the problem is somewhere in the Drupal configuration or, probably, in your custom code.
In any case, there has to be something wrong with your setup specifically – in general, this should all work fine out-of-the-box.
As a last resort, if you can identify why those short words are treated specially, you could also remove that configuration to just have them treated normally. Having them cause empty result sets is of course the worst possible outcome for common words.
- 🇺🇸United States brockfanning
Thanks so much for the reply! We use Acquia for hosting, and we're experiencing this with one of the built-in "configsets", which is described as: "(Latest) Drupal 9 / 10 - Search API Solr 4.3.2 - Solr 8 - [drupal-4.3.2-solr-8.x-0] - v1.0"
As for module versions, we're using Search API 1.3.1, and Search API Solr 4.3.2.
It's very helpful to know how it should behave, thank you. I will go ahead with your recommendation of trying to see the request that is sent to Solr (I'm not sure offhand how to do that on Acquia hosting but I will research). In the meantime, if you have any prior experience with Acquia's configsets and know of any gotchas, please let me know. I can download the config set so if there is anything in there I should paste here, I can.
Finally in regards to custom code - I tried creating a brand new view which would not have been affected by our custom code (our custom code is tied to a specific view id) and still experienced the problem, so I think we can rule out our custom code as a culprit.
- 🇦🇹Austria drunken monkey Vienna, Austria
I will go ahead with your recommendation of trying to see the request that is sent to Solr (I'm not sure offhand how to do that on Acquia hosting but I will research).
If you’re using Devel, there is the Search API Solr Devel module (already included when downloading Search API Solr) that lets you log Solr requests. Otherwise, you can add custom code mimicking \Drupal\search_api_solr_devel\Logging\SolariumRequestLogger.
In the meantime, if you have any prior experience with Acquia's configsets and know of any gotchas, please let me know. I can download the config set so if there is anything in there I should paste here, I can.
The configuration of the field type used for your fulltext fields would be interesting, but for that we’d first need to know the Solr field names of those fields. Then you can look up the used type by comparing the <dynamicField> elements in the Solr schema (usually contained in the schema_extra_fields.xml file). Finally, find the <fieldType> element where the name attribute matches the field type and post that here.
- 🇺🇸United States brockfanning
I've installed search_api_solr_devel and now I can see the debugging info when I run a search. I thought it would be a little more obvious what I was looking for, but unfortunately I'm being dense. I'm looking for any evidence of my search terms, and I do see them in Solr response body -> responseHeader -> params -> q. If that's the correct place to look, it appears that the stopwords do show up there.
For example if I search for "justice lawyers" then I see this:
q => string (21) "+"justice" +"lawyers""
But if I search for "justice and lawyers" then I see this:
q => string (28) "+"justice" +"and" +"lawyers""
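For reference, the two logged strings can be reproduced with a few lines of Python. This is only a sketch that mimics the observed output, not the Search API Solr implementation, and `build_required_terms_query` is a hypothetical name. In Lucene query syntax the leading `+` marks a term as required, so the second query demands a match on "and" as well.

```python
def build_required_terms_query(keywords):
    """Mark every whitespace-separated keyword as a required, quoted term."""
    return " ".join(f'+"{word}"' for word in keywords.split())


q1 = build_required_terms_query("justice lawyers")
q2 = build_required_terms_query("justice and lawyers")

# The lengths match the "string (21)" and "string (28)" values in the log.
print(q1, len(q1))
print(q2, len(q2))
```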
As for the Solr configuration, I am unsure of what I'm looking for there too. On /admin/config/search/search-api/server/[my server name]/solr_field_type I clicked on "Get schema_extra_types.xml" and I see a file with a long list of elements, which mostly appear to be specific to languages (our site is multilingual). The first example is:
<dynamicField name="ts_X3b_ar_*" type="text_ar" stored="true" indexed="true" multiValued="false" termVectors="true" omitNorms="false" />
I gather that I need to compare the name attribute "ts_X3b_ar_*" with something else in another file, but I'm stumped on where to look next.
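The lookup suggested earlier in the thread (match a Solr field name against the <dynamicField> patterns, then find the <fieldType> with that name) could be scripted roughly as below. This is a sketch under several assumptions: the XML snippets are hypothetical and heavily trimmed, the field name `ts_X3b_en_title` is made up, and the downloaded files are assumed to be rootless fragments that need a wrapper element before parsing. The sample deliberately leaves the stop filter out of the query analyzer so the check has something to report; it is not a claim about what Acquia's configset contains.

```python
import fnmatch
import xml.etree.ElementTree as ET

# Hypothetical, trimmed stand-ins for the downloaded schema files.
EXTRA_FIELDS = '''
<dynamicField name="ts_X3b_ar_*" type="text_ar" stored="true" indexed="true" />
<dynamicField name="ts_X3b_en_*" type="text_en" stored="true" indexed="true" />
<dynamicField name="tm_X3b_en_*" type="text_en" stored="true" indexed="true" multiValued="true" />
'''
EXTRA_TYPES = '''
<fieldType name="text_en" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.StopFilterFactory" words="stopwords_en.txt" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
  </analyzer>
</fieldType>
'''


def parse_fragment(fragment):
    """Wrap a rootless XML fragment in a synthetic root so it parses."""
    return ET.fromstring(f"<root>{fragment}</root>")


def field_type_for(solr_field, fields_root):
    """Return the type of the first dynamicField whose pattern matches."""
    for element in fields_root.iter("dynamicField"):
        if fnmatch.fnmatchcase(solr_field, element.get("name")):
            return element.get("type")
    return None


def stop_filter_by_analyzer(type_name, types_root):
    """Map each analyzer type to whether it contains a StopFilterFactory."""
    for field_type in types_root.iter("fieldType"):
        if field_type.get("name") == type_name:
            return {
                analyzer.get("type"): any(
                    f.get("class") == "solr.StopFilterFactory"
                    for f in analyzer.iter("filter")
                )
                for analyzer in field_type.iter("analyzer")
            }
    return {}


type_name = field_type_for("ts_X3b_en_title", parse_fragment(EXTRA_FIELDS))
report = stop_filter_by_analyzer(type_name, parse_fragment(EXTRA_TYPES))
print(type_name, report)
```

A report where the index analyzer has a stop filter and the query analyzer does not would point at the index-only stopword handling discussed above.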
- 🇦🇹Austria drunken monkey Vienna, Austria
As the keywords are passed as-is to Solr (yes, they are in the q parameter) it seems this is a Solr configuration problem. I’m therefore moving your issue to that queue.
It seems you are searching for English text, so probably you’re looking for the ts_X3b_en_* and tm_X3b_en_* dynamic fields. They most likely both have type text_en, so you’re looking for <fieldType name="text_en" … inside schema_extra_types.xml.
- 🇩🇪Germany mkalkbrenner 🇩🇪
Stopwords are adjustable. They're managed as Drupal configs and get applied if you generate and deploy a configset.
But if Acquia doesn't allow updating the configset, I assume that they're using the default. And in the default, "and" and "of" are declared as stopwords.
AFAIK Acquia still forces the dismax query parser, which also leads to different results and a different interpretation of "all words" and "any word".
- 🇺🇸United States brockfanning
Thanks all for the feedback. I appreciate the help. Unfortunately I'm still not sure how to resolve the problem. I am perfectly happy to have these stopwords, but I need them to be ignored, rather than actually affecting the query. If the user enters any stopwords at all, 0 results are returned.
I hope that someone else who ran into this might be able to help with some guidance on the correct combination of settings to get around this problem.
- 🇺🇸United States brockfanning
Acquia support pointed me to this article, which does seem to resolve the problem: https://acquia.my.site.com/s/article/No-Search-Results-when-stopwords-ar...
In a nutshell, it says to use "Direct query" instead of "Multiple words" for the parser mode.
- 🇩🇪Germany mkalkbrenner 🇩🇪
Sorry, but this recommendation is stupid.
The direct query parser uses the stop words, too. That depends on the fields that are queried, not the parser.
The problem is that they still force edismax internally in the connector. I explained multiple times why this is a bad idea, but they're ignoring it.
- 🇺🇸United States brockfanning
I can definitely follow up with them on this! I can pass along the idea of not forcing edismax internally. Do I understand correctly that if they did not force edismax, we could potentially continue using "Multiple words"?
- 🇺🇸United States japerry KVUO
The problem is that they still force edismax internally in the connector. I explained multiple times why this is a bad idea, but they're ignoring it.
Per Markus' recommendation, we did add a feature to disable the edismax functionality quite a while ago (#3305163: Most Search API boost processors have no effect), and it's set to be disabled by default:
https://git.drupalcode.org/project/acquia_search/-/commit/03d2d59b2e9c6b...
Unless there is something I'm missing, please file a specific issue. But I don't think his comment in 2024 reflects the reality of the module.
The direct query parser uses the stop words, too. That depends on the fields that are queried, not the parser.
Acquia provides customers a way to customize configsets, so whatever issue you're seeing is likely Solr configuration. Those working on the connector module here have limited experience with actual Solr usage, so I'd defer to Markus or someone else in the Solr community on how to make that work. If there is something that needs to change in the module, feel free to raise a specific issue.
- 🇺🇸United States brockfanning
Thank you @japerry that's great to know. I had not seen any toggle for edismax in any Search API forms, and in fact I still cannot see them. Is this toggle supposed to appear in the UI anywhere?
I do see this in my config file:
third_party_settings:
  acquia_search:
    use_edismax: true
So for now I will try changing this to false directly in the config file and report back.
- 🇺🇸United States japerry KVUO
It's part of the index and not the server itself. Attached a screenshot. You should be able to find it by going to: "en/admin/config/search/search-api/index/acquia_search_index/edit"