- Issue created by @tuwebo
- 🇩🇪Germany mkalkbrenner 🇩🇪
I would not use the direct query parser for public sites, just for internal tools.
The problem is that Search API doesn't know the concept of boolean operators.
A good implementation would be to add such a parse mode to Search API and to handle it in Search API Solr.
This way we would not open the entire query language to the user. A shortcut might be to add a "boolean operators" query parser plugin to search_api_solr only.
- 🇪🇸Spain tuwebo
Hello @mkalkbrenner, thank you very much for taking the time and for the fast response.
I will start by looking at search_api_solr, which seems the faster and easier approach to implement, and then maybe look at search_api, which is the optimal solution.
- 🇪🇸Spain tuwebo
A potential parse mode, based on Direct, that handles boolean operators and grouping could look like this:
```php
namespace Drupal\search_api_solr\Plugin\search_api\parse_mode;

use Drupal\Component\Utility\Unicode;
use Drupal\search_api\Plugin\search_api\parse_mode\Direct;

/**
 * Represents a parse mode that handles Boolean operators and grouping.
 *
 * @SearchApiParseMode(
 *   id = "direct_boolean_operators",
 *   label = @Translation("Direct query boolean operators"),
 *   description = @Translation("A direct query allowing boolean operators and grouping. Might fail if the query contains syntax errors in regard to the specific server's query syntax."),
 * )
 */
class DirectBooleanOperators extends Direct {

  /**
   * {@inheritdoc}
   */
  public function parseInput($keys) {
    // Check if input is an array.
    if (is_array($keys)) {
      // Validate each element in the array.
      foreach ($keys as $key) {
        if (!Unicode::validateUtf8($key)) {
          return '';
        }
      }
      // Convert array to string with spaces between elements.
      $keys = implode(' ', $keys);
    }
    else {
      // Validate the single string input.
      if (!Unicode::validateUtf8($keys)) {
        return '';
      }
    }

    // Test string:
    // "Drupal 10 theming" AND (views OR "content types") NOT "user authentication" + performance~2 OR security^2 && (module || plugin) !deprecated
    // Boolean operators and valid symbols:
    // ['AND', 'OR', 'NOT', '&&', '||', '!', '+', '-'];
    // Valid group and escape chars:
    // ['(', ')', '\'];

    // Normalize whitespace.
    $keys = preg_replace('/\s+/u', ' ', trim($keys));

    // Keep Boolean operators and symbols surrounded by single spaces.
    $keys = preg_replace('/\s(AND|OR|NOT|!|\|\||&&)\s/', ' $1 ', $keys);

    // Define special characters to escape.
    $escape_special_chars = ['{', '}', '[', ']', '^', '~', '*', '?', ':'];

    // Handle special characters outside of quotes.
    $keys = preg_replace_callback('/("[^"]+")|\S+/', function ($matches) use ($escape_special_chars) {
      if (isset($matches[1])) {
        // This is a quoted phrase, don't modify anything inside.
        return $matches[0];
      }
      else {
        // This is not a quoted phrase, escape only the specified special
        // characters.
        $term = $matches[0];
        foreach ($escape_special_chars as $char) {
          $term = str_replace($char, '\\' . $char, $term);
        }
        return $term;
      }
    }, $keys);

    // @todo Handle NegativeQueryProblems: Pure Negative Queries
    // https://cwiki.apache.org/confluence/display/SOLR/NegativeQueryProblems#NegativeQueryProblems-PureNegativeQueries
    return $keys;
  }

}
```
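To try the escaping step outside of Drupal, here is a minimal standalone sketch. The function name `escape_outside_quotes()` is hypothetical and not part of search_api_solr; it only mirrors the callback from the plugin above, assuming the same list of characters to escape.

```php
<?php

/**
 * Hypothetical helper (not part of search_api_solr): escapes selected
 * special characters in every token that is not a quoted phrase.
 */
function escape_outside_quotes(string $keys): string {
  // Same character list as in the parse mode sketch above.
  $special = ['{', '}', '[', ']', '^', '~', '*', '?', ':'];

  return preg_replace_callback('/("[^"]+")|\S+/', function (array $matches) use ($special) {
    if (!empty($matches[1])) {
      // Quoted phrase: return it untouched.
      return $matches[0];
    }
    // Unquoted token: prefix each listed character with a backslash.
    $term = $matches[0];
    foreach ($special as $char) {
      $term = str_replace($char, '\\' . $char, $term);
    }
    return $term;
  }, $keys);
}

// Prints: title\:drupal AND "content:types" security\^2
echo escape_outside_quotes('title:drupal AND "content:types" security^2'), "\n";
```

Note that with this character list, boosts (`^2`) and fuzzy/proximity markers (`~2`) outside of quotes are escaped as well, so they reach Solr as literal characters rather than as operators.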
- 🇪🇸Spain tuwebo
A query parser that could fit is the Simple Query Parser: https://solr.apache.org/guide/8_1/other-parsers.html#simple-query-parser
It can be added in solrconfig_extra.xml as shown below, or by using the PostConfigFilesGenerationEvent:
<queryParser name="simple" class="solr.SimpleQParserPlugin"/>
I think we should NOT allow the WHITESPACE operator (at least), but there is an easy way to restrict the operators: pass a list of the allowed ones in the q.operators parameter.
The downside is that we won't be able to handle Function Queries: https://solr.apache.org/guide/8_1/function-queries.html
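As a rough illustration of the q.operators restriction, the request parameters could be assembled as follows. This is only a sketch: `build_simple_parser_params()` is a hypothetical helper, the chosen operator list is an assumption, and how these parameters would actually be attached to the query inside search_api_solr is left open here.

```php
<?php

/**
 * Hypothetical helper: builds Solr request parameters that select the
 * Simple Query Parser and enable only the listed operators.
 */
function build_simple_parser_params(string $keys, array $allowed_operators): array {
  return [
    // Use the Simple Query Parser for the main query.
    'defType' => 'simple',
    'q' => $keys,
    // Comma-separated operator names; anything not listed is disabled.
    'q.operators' => implode(',', $allowed_operators),
  ];
}

// Assumption: allow boolean logic, phrases and grouping, but leave out
// WHITESPACE, PREFIX, FUZZY, NEAR and ESCAPE.
$params = build_simple_parser_params(
  'drupal + (views | "content types") -deprecated',
  ['AND', 'OR', 'NOT', 'PHRASE', 'PRECEDENCE']
);

// URL-encoded form, ready to append to a /select request.
echo http_build_query($params), "\n";
```

In the Simple Query Parser syntax, `+` means AND, `|` means OR, `-` means NOT, double quotes mark a phrase and parentheses set precedence, which is why the example query uses those symbols instead of the keywords.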
- 🇩🇪Germany mkalkbrenner 🇩🇪
Both options seem to be good.
Using the Simple Query Parser seems to be straightforward. But we should also think about how Search API processors will work with it.
But it is worth a try.