This did not have as much effect as I thought it would: there are still hundreds of calls to node_access and taxonomy_index per page, even with just myself browsing the site on a lower environment.
Landed on this issue like others diagnosing what could be wrong. In my case, I noticed a few things.
permissions_by_term_node_access is called several times per page, as designed. However, the service it calls does not account for a few things:
- Whether the target page contains a field that is used by PBT (it only checks if it has a taxonomy term field)
- Whether the target page(s) in canUserAccessByNode have values for those field(s).
In our case, we only use one vocab for PBT and only on a few content types. A majority of pages that have the field have no value.
I am going to try the following:
- Limit isAnyTaxonomyTermFieldDefinedInNodeType to respecting the configuration of PBT:
/**
 * Checks whether there are taxonomy fields defined in a given node type.
 */
public function isAnyTaxonomyTermFieldDefinedInNodeType(string $nodeType): bool {
  $fieldDefinitions = $this->entityFieldManager->getFieldDefinitions('node', $nodeType);
  $config = \Drupal::config('permissions_by_term.settings');
  $pbt_target_bundles = $config->get('target_bundles') ?? [];
  foreach ($fieldDefinitions as $fieldDefinition) {
    if ($fieldDefinition->getType() === 'entity_reference' && strpos((string) $fieldDefinition->getSetting('handler'), 'taxonomy_term') !== FALSE) {
      // 'target_bundles' can be missing or NULL on the field; fall back to an
      // empty array so array_flip() does not throw. Such fields are skipped.
      $field_target_bundles = $fieldDefinition->getSetting('handler_settings')['target_bundles'] ?? [];
      if (array_intersect_key(array_flip($field_target_bundles), array_flip($pbt_target_bundles))) {
        return TRUE;
      }
    }
  }
  return FALSE;
}
- Check if the $node has values for the taxonomy field in canUserAccessByNode - if it doesn't, then allow the user access:
$configPermissionMode = $this->configFactory->get('permissions_by_term.settings')->get('permission_mode');
$requireAllTermsGranted = $this->configFactory->get('permissions_by_term.settings')->get('require_all_terms_granted');
if (!$configPermissionMode && !$requireAllTermsGranted) {
$access_allowed = TRUE;
}
else {
$access_allowed = FALSE;
}
if ($node->hasField('field_access_restrictions') && $node->get('field_access_restrictions')->isEmpty()) {
return TRUE;
}
// ... rest of method
I am hoping to see some improved NewRelic reports from this - we are seeing a ton of activity on MySQL and taxonomy_index degrading performance and I believe 90% of it does not need to occur.
Circling back to this, I added in the #access FALSE, which should prevent it from being printed (in normal rendering?). I updated the tests to ensure that trying to render produces empty output. This may also be a case where there is markup outside of the twig tag that checks `{% if content %}` or a similar evaluation to ensure it is not empty.
A more complete example may look something like this:
/**
 * {@inheritdoc}
 */
public function postExtractResults(PostExtractResultsEvent $event): void {
  $query = $event->getSearchApiQuery();
  if (!$query->getIndex()->isValidProcessor('solr_densevector')) {
    return;
  }
  try {
    $processor = $query->getIndex()->getProcessor('solr_densevector');
    $settings = $processor->getConfiguration();
    if (empty($settings['content_field'])) {
      return;
    }
    foreach ($query->getResults() as $result) {
      $field = $result->getField($settings['content_field']);
      // Skip results that do not carry the configured field.
      if ($field === NULL) {
        continue;
      }
      $field_values = [];
      foreach ($field->getValues() as $value) {
        // Text fields return TextValue objects; other types return scalars.
        $text = ($value instanceof TextValue) ? $value->getText() : $value;
        $field_values[] = strip_tags((string) $text);
      }
      $result->setExtraData('content', implode(' ', $field_values));
    }
  }
  catch (\Exception $e) {
    // Log the exception here.
  }
}
kevinquillen → created an issue. See original summary → .
I see what you mean, but where would that be set? In a ProcessingResultsEvent event subscriber?
Also, I spent yesterday looking into this for the first time (RAG tool with AI Chatbot). In cases where you want to use a View or different Solr options (parse mode, dismax/edismax, etc.), those are lost in the tool because the query is called and executed directly. This was causing me to never get any results.
That caused me to write my own plugin like:
/**
* {@inheritdoc}
*/
public function execute() {
// Collect the context values.
$this->searchString = $this->getContextValue('search_string');
$end_results = [];
try {
$view = Views::getView('search');
$view->setDisplay('sitesearch');
$view->setExposedInput(['keyword' => $this->searchString]);
$view->execute();
$i = 1;
/** @var \Drupal\views\ResultRow $result */
foreach ($view->result as $result) {
/** @var \Drupal\node\NodeInterface $node */
$node = $result->_entity;
$url = $node->toUrl()->toString();
$end_results[] = "#$i: " . $node->getTitle() . " (<a href=\"$url\">visit page</a>)<br>";
$i++;
if ($i >= 6) {
break;
}
}
}
catch (\Exception $e) {
$this->setOutput("We're sorry, but I could not search the site. Please give me a few moments and try again.");
return;
}
if (count($end_results)) {
$output = "I was able to find some relevant content for you! Here are the top results based on what you asked:<br><br>";
$output .= implode("<br/><br/>", $end_results);
$output .= "<br/><br/>-----<br/>Didn't find what you were looking for? Try using our more robust <a href='/search'>site search</a>!";
$this->setOutput($output);
}
else {
$this->setOutput("No results were found when searching in the rag index " . $this->index . " for the following prompt: " . $this->searchString . ".\n");
}
}
Which produced results. I could also confirm that the event subscriber in this module was also fired, so it performed a RAG search against Solr. These are probably items to report back to the main ai_search module because it can be a little confusing when the RAG tool returns nothing. I see that ai_search module provides its own search backend that creates the content property.
If I add that event to the current query event subscriber:
public function postExtractResults(PostExtractResultsEvent $event): void {
$results = $event->getSearchApiQuery()->getResults();
foreach ($results as $result) {
$result->setExtraData('content', 'Content string here');
}
}
I can see the result on the FunctionTool plugin I made. Perhaps the shortest solution here is adding a processor setting that asks the user which field should be returned as 'content' in this context.
Does this include where CKEditor can stream the response?
if ($response instanceof StreamedChatMessageIteratorInterface) {
return new StreamedResponse(function () use ($response) {
foreach ($response as $message) {
echo $message->getText();
ob_flush();
flush();
}
}, 200, [
'Cache-Control' => 'no-cache, must-revalidate',
'Content-Type' => 'text/event-stream',
'X-Accel-Buffering' => 'no',
]);
}
Looks like this was somewhat fixed in alpha2. I have updated the behavior name to match naming conventions, removed superfluous comments and removed the CSS animations. The animations may not look good in different admin themes and can present some accessibility challenges.
Are you able to see the size property in the configuration files on Solr? It should reflect the size of the model (1536) on the processor settings. If not, you may need to upload that configset and reload the core. Otherwise, it may simply just be a warning from Lucene/Solr.
What version of Solr are you running? IIRC this dimension size was capped at 1024 until around Solr 9.3.
Sure, I'm interested in seeing a conceptual solution of old way (no HTMX) / new way with HTMX. Does it also work for Form AJAX?
I read the CRs, but could the examples be expanded on a bit?
A quick, naive change I made was to replace any '/' in a model name with '+' so the URL didn't break, then rewrite the '+' back to '/' for the form and title method(s).
This works, even though Ollama models (at least this one) cannot be edited.
Either way, I think such a change is necessary to make these screens function or alternatively handling model_id in the route differently so it works for all cases.
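As a rough sketch of the '/' to '+' swap described above, in plain PHP. The function names `encode_model_id` and `decode_model_id` are hypothetical and not part of any module; the sketch also assumes model names never contain a literal '+', since that would not survive the round trip.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: makes a model name safe to use as one route segment.
 *
 * Assumes the model name contains no literal '+'.
 */
function encode_model_id(string $modelId): string {
  // 'org/model' would otherwise be read as two path segments.
  return str_replace('/', '+', $modelId);
}

/**
 * Hypothetical helper: restores the original model name from the route value.
 */
function decode_model_id(string $routeValue): string {
  return str_replace('+', '/', $routeValue);
}

$encoded = encode_model_id('hf.co/org/model');
echo $encoded . PHP_EOL;
echo decode_model_id($encoded) . PHP_EOL;
```

The encoded value would go into the URL, and the form and title callbacks would decode it before looking the model up. Handling model_id differently in the route, as mentioned above, would avoid the '+' caveat entirely.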
This is because the event subscriber fires when the module is enabled, and the event subscriber loads the processor from the index and runs operations. Now there is a check first to see that solr_densevector is a valid processor before continuing. However, if you remove this processor, you should also remove any 'Dense Vector' field configured on the index, because it will fail at index time trying to store a non-vectorized value in the vector field in Solr.
Not sure if this is related to this issue, but I wound up in the same area of code that the patch is addressing but for another reason.
My case: I have a Views REST display with a path of api/v1/foo/bar/%node. It has two contextual filters. The first one uses the URL value to load a node for the contextual filter. The second one uses the currently logged in user.
The admin View UI preview works fine by just passing a node id in "Preview with contextual filters". However, if I am trying to assemble the URL to pass along for a decoupled React app, Views requires the two arguments:
$view_url = $this->request->getSchemeAndHttpHost() .
$view->getUrl(
[
$node->id(),
\Drupal::currentUser()->id(),
]
)->toString();
If I visit api/v1/foo/bar/(node id) directly, I get results without needing the additional argument in the URL. If I curl the URL (with an authenticated cookie value from my session) I get a value.
I can alternatively do this:
if ($display_id == 'my_view_id' && !empty($node->id())) {
$view_url = $this->request->getSchemeAndHttpHost() . Url::fromUserInput('/' . str_replace('%node', $node->id(), $view->getPath()))->toString();
}
but that is not so great to read or maintain. Should the URL be the equivalent of the path? How is that interpreted?
kevinquillen → created an issue.
Placeholder isn't clear. The field defaults to 11434 and has a note in the field description. Setting to NR.
For DDEV and typical Docker-based services, the service name is enough (Docker will resolve this behind the scenes). http://ollama will work, similar to running Solr in DDEV.
kevinquillen → made their first commit to this issue’s fork.
That is correct, we were told it is because of the use of $_SESSION to pass state and values around. It does not persist, or is not guaranteed to be valid when read, in all cases (it always worked locally in DDEV).
mably → credited kevinquillen → .
Unavoidable at the time. Open a new issue and it should be an easy fix. In the long run we should probably figure out a better way to handle this, even though new numeric series models for OpenAI are not too frequent.
Just as an update here:
1. On first index creation, go to the Processors tab.
2. Enable the DenseVector processor.
3. Configure the processor - first select the provider and save, then reload and pick the AI model (this will be improved later).
4. Save.
This should fix it. Why it errors on first creation of an index I am not sure of yet.
In that case this change should likely apply to both the variables and token section, not just the token section.
I think this is because of submit buttons not having a unique #name value. If I give Remove a unique value like 'remove_token_$i' then extract that element in the ajax callback, the issue goes away.
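A minimal sketch of that idea, written as plain PHP arrays so it can run outside Drupal. The helper names `build_remove_buttons` and `removed_row_index` are hypothetical; in a real form the name passed to the second helper would presumably come from the triggering element's `#name` in the AJAX callback.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: builds one Remove button per row with a unique #name.
 */
function build_remove_buttons(int $count): array {
  $rows = [];
  for ($i = 0; $i < $count; $i++) {
    $rows[$i]['remove'] = [
      '#type' => 'submit',
      '#value' => 'Remove',
      // Without a unique #name, every button shares the same name and the
      // form cannot tell which one was pressed.
      '#name' => 'remove_token_' . $i,
    ];
  }
  return $rows;
}

/**
 * Hypothetical helper: recovers the row index from a button's #name.
 */
function removed_row_index(string $triggeringName): ?int {
  if (preg_match('/^remove_token_(\d+)$/', $triggeringName, $matches) === 1) {
    return (int) $matches[1];
  }
  return NULL;
}

$rows = build_remove_buttons(3);
var_dump(removed_row_index($rows[2]['remove']['#name']));
```

The same naming scheme could be applied to both the variables and token sections, with a different prefix for each.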
I also cannot see the Close button in Gin, but I can see it in the source of the modal. I had to write a lot of CSS overrides to get this to work for me in Gin:
html .ui-dialog .ui-dialog-titlebar .ui-dialog-titlebar-close {
margin: 12px 5px 0 0 !important;
padding: 0 !important;
opacity: 1 !important;
inline-size: 25px;
background: none;
}
html .ui-dialog .ui-dialog-titlebar .ui-dialog-titlebar-close .ui-icon.ui-icon-closethick {
background: #fff !important;
}
Now I can see the close icon.
This seems like it needs a reroll for 11.2.
Quick question, how were you able to get multiple search terms passed?
Please read the README for Search API Solr on how to generate and update XML configuration sets for Solr. It will do the necessary work and not require manual XML editing.
It is expected you follow Search API Solr 4.3+ setup instructions, yes, on configuring Drupal for Solr and generating the necessary configuration for it.
As for Solr itself, I can't speak to that (how anyone installs it). I used DDEV's Solr instance and it worked perfectly fine.
kristen pol → credited kevinquillen → .
Yeah, I noted that in #8 https://www.drupal.org/project/content_moderation_notifications/issues/3... 🐛 Unable to send email with reply-to not set error Active
Rebased branch so the patch can apply to the latest Facets release(s).
The patch no longer applies. The issue is marked fixed, but is this really fixed in 3.x?
Ran into this same issue. Instead of putting it in the callback, though, I put the change in my buildForm.
I wound up having to drop this approach, largely because Domain Config UI relies on reading and writing to $_SESSION, which does not work on managed hosts like Acquia. On top of that, the request object isn't always populated (I assume this is a result of the redirect commands in AJAX forms?), which makes using the Session service really difficult (internally it also relies on the Request object existing).
Instead, I made a normal form in place of the maintenance mode form with a select field for domains and handled the save method myself, with the same service from the proposed patch.
You can now change the model and sim function on the processor config. The vector dimension is inferred from the provider plugin and settings are updated on the field type. Editing field type defs this way does not feel that good, but it's all I can do at the moment.
I've published some of the combined work here until it's supported in Search API Solr:
https://www.drupal.org/project/search_api_solr_dense_vector
Hacking around on the current changes, I was able to get to this point. I need to populate my site with more than test content to do more tests. I rewired to use AI provider and put a setting on the index to let you pick which provider to use and what the vector dimension size should be (currently doesn't affect the field type). I pull the embeddings model from default AI settings, and updated the field dimension size in the UI and uploaded the config sets to Solr. I put a dense vector field on the index for Title, and indexed it. Still needs a decent amount of work for smoother UI setup.
nicxvan → credited kevinquillen → .
The logic of the job looks sound to me.