- Issue created by @tuwebo
- 🇩🇪Germany mkalkbrenner 🇩🇪
I would not use the direct query parser for public sites, just for internal tools.
The problem is that Search API doesn't know the concept of boolean operators.
A good implementation would be to add such a parse mode to Search API and to handle it in Search API Solr.
This way we would not open the entire query language to the user. A shortcut might be to add a "boolean operators" query parser plugin to search_api_solr only.
- 🇪🇸Spain tuwebo
Hello @mkalkbrenner, thank you very much for taking the time and for the fast response.
I will start by looking at search_api_solr, which seems the faster and easier approach to implement, and then maybe look at search_api, which is the optimal solution.
- 🇪🇸Spain tuwebo
A potential parse mode, based on Direct, that handles Boolean operators and grouping could look like this:
namespace Drupal\search_api_solr\Plugin\search_api\parse_mode;

use Drupal\Component\Utility\Unicode;
use Drupal\search_api\Plugin\search_api\parse_mode\Direct;

/**
 * Represents a parse mode that handles Boolean operators and grouping.
 *
 * @SearchApiParseMode(
 *   id = "direct_boolean_operators",
 *   label = @Translation("Direct query boolean operators"),
 *   description = @Translation("A direct query allowing boolean operators and grouping. Might fail if the query contains syntax errors in regard to the specific server's query syntax."),
 * )
 */
class DirectBooleanOperators extends Direct {

  /**
   * {@inheritdoc}
   */
  public function parseInput($keys) {
    // Check if input is an array.
    if (is_array($keys)) {
      // Validate each element in the array.
      foreach ($keys as $key) {
        if (!Unicode::validateUtf8($key)) {
          return '';
        }
      }
      // Convert array to string with spaces between elements.
      $keys = implode(' ', $keys);
    }
    else {
      // Validate the single string input.
      if (!Unicode::validateUtf8($keys)) {
        return '';
      }
    }

    // Test string
    // "Drupal 10 theming" AND (views OR "content types") NOT "user authentication" + performance~2 OR security^2 && (module || plugin) !deprecated
    // Boolean operators and valid symbols.
    // ['AND', 'OR', 'NOT', '&&', '||', '!', '+', '-'];
    // Valid group and escape chars.
    // ['(', ')', '\'];

    // Normalize whitespace.
    $keys = preg_replace('/\s+/u', ' ', trim($keys));

    // Handle Boolean operators and symbols, remove extra whitespaces.
    $keys = preg_replace('/\s(AND|OR|NOT|!|\|\||&&)\s/', ' $1 ', $keys);

    // Define special characters to escape.
    $escape_special_chars = ['{', '}', '[', ']', '^', '~', '*', '?', ':'];

    // Handle special characters outside of quotes.
    $keys = preg_replace_callback('/("[^"]+")|\S+/', function ($matches) use ($escape_special_chars) {
      if (isset($matches[1])) {
        // This is a quoted phrase, don't modify anything inside.
        return $matches[0];
      }
      else {
        // This is not a quoted phrase, escape only the specified special
        // characters.
        $term = $matches[0];
        foreach ($escape_special_chars as $char) {
          $term = str_replace($char, '\\' . $char, $term);
        }
        return $term;
      }
    }, $keys);

    // @TODO
    // Handle NegativeQueryProblems: Pure Negative Queries
    // https://cwiki.apache.org/confluence/display/SOLR/NegativeQueryProblems#NegativeQueryProblems-PureNegativeQueries
    return $keys;
  }

}
- 🇪🇸Spain tuwebo
A potential query parser that could fit is the Simple Query Parser: https://solr.apache.org/guide/8_1/other-parsers.html#simple-query-parser
It can be added in solrconfig_extra.xml as shown here, or through the PostConfigFilesGenerationEvent:
<queryParser name="simple" class="solr.SimpleQParserPlugin"/>
I think we should NOT allow the WHITESPACE operator (at least), but there is an easy way to restrict operators: pass the list of allowed ones in the q.operators parameter.
The downside is that we won't be able to handle Function Queries: https://solr.apache.org/guide/8_1/function-queries.html
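To make the q.operators restriction concrete, here is a minimal sketch of the request parameters such a search might send. The helper name `simple_parser_params()` and the chosen operator list are assumptions for illustration only; this is not code from Search API Solr.

```php
<?php

/**
 * Hypothetical helper: builds Solr request parameters that select the
 * Simple Query Parser and enable only a subset of its operators.
 */
function simple_parser_params(string $keys): array {
  return [
    // Select the parser registered as "simple".
    'defType' => 'simple',
    // Assumed allow-list: WHITESPACE, PREFIX, FUZZY, NEAR and ESCAPE are
    // deliberately left out, so they are treated as plain text.
    'q.operators' => 'AND,OR,NOT,PHRASE,PRECEDENCE',
    'q' => $keys,
  ];
}

$params = simple_parser_params('"content types" +(views | theming) -deprecated');
// Print the encoded query string that would be appended to the select URL.
echo http_build_query($params, '', '&', PHP_QUERY_RFC3986), PHP_EOL;
```

Note that the Simple Query Parser writes its operators as symbols (`+` for AND, `|` for OR, `-` for NOT), so the names in q.operators refer to those symbols rather than to the words AND, OR and NOT.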
- 🇩🇪Germany mkalkbrenner 🇩🇪
Both options seem to be good.
Using the Simple Query Parser seems to be straightforward. But we should also think about how Search API processors will work with it.
But it is worth a try.
- 🇺🇸United States pramodganore
I did notice, however, that Solr searches are case sensitive for the operators.
Example: “and”, “AND”, “And”. Unlike power users, general users would not be aware of the subtle differences.
Is there an existing solution, a checkbox away, already built into search?
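One conceivable way to soften this case sensitivity would be to uppercase bare and/or/not before the keys reach Solr. The following is only a sketch under that assumption: `uppercase_boolean_operators()` is a hypothetical helper, not part of Search API or Search API Solr, and it would also turn a literal "and" typed as an ordinary word into an operator.

```php
<?php

/**
 * Hypothetical helper: uppercases and/or/not outside of quoted phrases.
 */
function uppercase_boolean_operators(string $keys): string {
  $result = preg_replace_callback(
    // Group 1 captures a quoted phrase, group 2 a bare operator word.
    '/("[^"]*")|\b(and|or|not)\b/iu',
    static function (array $matches): string {
      // A quoted phrase matched: leave it exactly as typed.
      if ($matches[1] !== '') {
        return $matches[0];
      }
      // A bare operator word matched: uppercase it.
      return strtoupper($matches[0]);
    },
    $keys
  );
  // preg_replace_callback() returns NULL on error (e.g. invalid UTF-8);
  // fall back to the unmodified input in that case.
  return $result ?? $keys;
}
```

For example, `views and "cats and dogs" Or not plugin` would become `views AND "cats and dogs" OR NOT plugin`, with the quoted phrase left alone.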
- 🇪🇸Spain tuwebo
Hello @pramodganore, thanks for taking the time to look at it.
The code in comment #3459227-7 (Potential risks using "Direct query" parse mode with views?) was just a proof of concept with very basic code (be aware of that, and read carefully the implications in the issue's description). There is no further code yet; I am still improving it, but I am not sure when it will be ready to post here.
Also, a lot of testing will need to be done and probably, as @mkalkbrenner mentioned, Search API processors will not fully work (I am thinking, for example, about the Highlight processor).
That being said, maybe the best solution could be just customizing your search form by adding some tips for the final user about how they should use it, since Solr is very picky about the syntax (it is not only case sensitive; some users may also forget to close single quotes, double quotes, parentheses...).
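As an illustration of the kind of tip a customized search form could show, here is a minimal sketch that checks for unclosed double quotes and unbalanced parentheses before the query is sent. `find_syntax_problems()` is a hypothetical helper, and the checks cover only these two mistakes, not Solr's full syntax.

```php
<?php

/**
 * Hypothetical helper: reports unclosed quotes and unbalanced parentheses.
 *
 * @return string[]
 *   Human-readable problem descriptions; empty if none were found.
 */
function find_syntax_problems(string $keys): array {
  $problems = [];
  $depth = 0;
  $in_quotes = FALSE;
  $length = strlen($keys);
  // Byte-wise iteration is safe for UTF-8 here because the characters of
  // interest are ASCII and never occur inside a multi-byte sequence.
  for ($i = 0; $i < $length; $i++) {
    $char = $keys[$i];
    if ($char === '\\') {
      // Skip the escaped character that follows the backslash.
      $i++;
      continue;
    }
    if ($char === '"') {
      $in_quotes = !$in_quotes;
      continue;
    }
    if ($in_quotes) {
      // Parentheses inside a quoted phrase are literal text.
      continue;
    }
    if ($char === '(') {
      $depth++;
    }
    elseif ($char === ')') {
      if ($depth === 0) {
        $problems[] = 'unmatched closing parenthesis';
      }
      else {
        $depth--;
      }
    }
  }
  if ($in_quotes) {
    $problems[] = 'unclosed double quote';
  }
  if ($depth > 0) {
    $problems[] = 'unclosed parenthesis';
  }
  return $problems;
}
```

A form validation handler could call this on the submitted keys and display the returned messages instead of sending a query that Solr would reject.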
I am also working on the other approach, using the "Simple Query Parser", but the first try yielded some unexpected results. If I come up with something useful, I'll update this issue.
- 🇺🇸United States pramodganore
I also noticed that when I search with “something” vs. something, the highlight does not apply when searching with double quotes.
I do understand the implications and saw the @TODO. For my use case we only need the basic Boolean search options, nothing advanced. I really appreciate you responding.