- 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
It should be a Search API processsor like the Content Access
- last update
over 1 year ago Composer require-dev failure - @eduardo-morales-alberti opened merge request.
- last update
over 1 year ago Composer require-dev failure - Issue was unassigned.
- 🇦🇹Austria drunken monkey Vienna, Austria
Thanks for starting work on this!
When the code is ready for review (optimally already including automated tests), please remember to post a patch so the test bot can test it. (Unfortunately, it doesn’t work for issue forks in this project, see #3190024: Problem with test dependencies when testing issue forks → .) - Status changed to Needs work
over 1 year ago 7:01am 16 May 2023 - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Ok @drunken monkey, work in progress by the moment
- last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
First version ready I will upload the patch
- last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Process unpublished filters from facets "extra_data" correctly.
- last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Add doc comment to method getUnpublishedEntitiesFromValues and define an undefined variable.
- last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Do not extract fields on the process result, only returns the fields that were previously set or extracted.
- last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Not necessary to set the field on the result item because the field class maintain the reference.
- last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
After a lot of testing and development, seems that changing the result on the post-result process is not a valid solution because the field values are not available, and the way to get them alters the result.
Using the result item, the get fields method, will return null if they were not extracted yet, and if you force the extraction it will alter the result, so I will try to develop it for the indexing stage instead of the post-result process stage.
https://git.drupalcode.org/project/search_api/-/blob/8.x-1.x/src/Item/It...
/** * Returns the item's fields. * * @param bool $extract * (optional) If FALSE, only returns the fields that were previously set or * extracted. Defaults to extracting the fields from the original object if * necessary. * * @return \Drupal\search_api\Item\FieldInterface[] * An array with the fields of this item, keyed by field identifier. */ public function getFields($extract = TRUE);
- last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
I will try to cover it on the indexing instead of the postprocess:
- Create a Search API field processor - Ok
- Reindex item when the taxonomy status change
- Use a cron process to do it
- last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - Status changed to Needs review
over 1 year ago 8:00am 21 June 2023 - last update
over 1 year ago 536 pass, 7 fail - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Added new patch.
Steps to test:- Have an index with content with referenced entities
-
Enable the processor with label "Exclude field references entities unpublished from results."
- Configure the processor, you should choose the fields and the content (can be aggregated fields)
- As a recommendation, add the field status from the referenced entity on the index, for example field_category:entity:status, this will reindex the content when the reference entity changes its status.
- Show the referenced field on an indexed view
- Reindex all items
- Check on the view that the unpublished entities are not present
- Check it with different entity types, languages and status
- last update
over 1 year ago Composer require-dev failure The last submitted patch, 21: search_api-allow_exclude_unpublished_references-3114792-21.patch, failed testing. View results →
- Status changed to Needs work
over 1 year ago 8:09am 21 June 2023 - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
I am not sure why the test failed, any clue @drunken monkey?
- last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago Composer require-dev failure - Status changed to Needs review
over 1 year ago 10:55am 26 June 2023 - last update
over 1 year ago 544 pass - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Excluded also the not translated entities.
- 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Track changes in references track all translations which is not efficient, the performance can be improved if it tracks only the updated language translation or all translations if the updated entity is the original translation.
- last update
over 1 year ago Composer require-dev failure - last update
over 1 year ago 544 pass - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Track only the changed translation if is not the default translation to index faster the stale entities.
- last update
over 1 year ago Composer require-dev failure - 🇪🇸Spain eduardo morales alberti Spain, 🇪🇺
Get the original translation on the TrackerHelper.php on the method trackReferencedEntityUpdate.
// Original entity, if available. $original = $deleted ? NULL : ($entity->original->getTranslation($entity->language()->getId()) ?? NULL);
Because on the ContentEntity::getAffectedItemsForEntityChange will try to compare the original entity with the current entity and can be a translation.
// We trigger re-indexing if either it is a removed entity or the // entity has changed its field value (in case it's an update). if (!$original_entity || !$items->equals($original_entity->get($relation_info['field_name']))) {
- last update
over 1 year ago 389 pass, 98 fail The last submitted patch, 30: search_api-allow_exclude_unpublished_references-3114792-30.patch, failed testing. View results →
- last update
over 1 year ago Composer require-dev failure - Status changed to Needs work
over 1 year ago 8:11am 13 July 2023 - last update
over 1 year ago 544 pass - Status changed to Needs review
over 1 year ago 8:44am 13 July 2023 - 🇫🇷France xavier.masson Haute-Normandie
I've tried the patch but doesn't fully work as expected as the patch only work with node entity type instead all the entity types fieldables. I'll try to find time to add support on all the entities types.
- Status changed to Needs work
about 1 year ago 8:51am 24 September 2023 - 🇩🇪Germany mkalkbrenner 🇩🇪
It is great to see some movement here.
But beside the fact that the current patch only supports nodes, there're more issues.
The processor seems to fail if the field isn't just a referenced entity, but a field of that entity.
And the processor could not handle deeply nested references like entity 1 references entity 2 which references entity3.I think both issues could be solved if the processor "simply" checks the status of all entities along the property path.
- Status changed to Needs review
about 1 year ago 1:35pm 10 October 2023 - last update
about 1 year ago Patch Failed to Apply - 🇩🇪Germany mkalkbrenner 🇩🇪
Here's a proof of concept. Using a processor that acts on the entire property path isn't easy.
I created a patch that checks the published state for all entities across a property path as soon as the EntityStatus processor is activated. - last update
about 1 year ago Patch Failed to Apply - last update
about 1 year ago 541 pass, 2 fail - 🇦🇹Austria drunken monkey Vienna, Austria
Thanks a lot, Markus, that does already look promising!
Attached is a re-roll that applies to the latest dev version of the module.The main problem with this approach is, of course, that it’s super unclean – we add code specific to a single processor to a framework class, even to a service, and now have to pass the index in various places where it shouldn’t be needed. However, passing anything but the index would be even worse, of course, since then the logic of when we want unpublished entities excluded would have to be determined in multiple places in the code – and this place in the code is definitely the best suited, by far, to implement this change, you’re abolustely correct there.
The probably cleanest way of implementing this would be to add a general method for processors to influence the field extraction process, but that’s of course way overkill just for solving this one problem. So, not really sure how to proceed. I’ll have to think about the least awful way forward here, I guess.Anyways, your patch is already very helpful, thanks again!
If others interested in this issue could give it a try, that would be great. Even better would be if someone could already write some tests, since those will be useful even in case we go with an entirely different approach (which I don’t think likely, but still). The last submitted patch, 42: 3114792-42--allow_excluding_unpublished_references.patch, failed testing. View results →
- codesniffer_fixes.patch Interdiff of automated coding standards fixes only.- last update
about 1 year ago 545 pass - 🇩🇪Germany mkalkbrenner 🇩🇪
Even if this isn't a beautiful architecture, I would like to see that code to be committed. We could comment it and declare it as intermediate solution to solve this issue.
We use this patch in production. It solved several issues in our searches. For example nothing was highlighted because the snippet belongs to some unpublished content. Or a facet item that leads to no results because the raw value is only used by not yet published content.
- Status changed to Needs work
about 1 year ago 11:46am 29 October 2023 - 🇦🇹Austria drunken monkey Vienna, Austria
OK, unless I can think of a better solution I will commit the current one, at least as a temporary workaround.
However, this still needs tests either way, so NW for that. - 🇷🇺Russia qzmenko Novosibirsk
We need to check if
$entityAdapter
is not null. It can be null if target entity not exists.Right now I recieve error during reindex:
Error: Call to a member function getValue() on null in function Drupal\search_api\Utility\FieldsHelper->extractFields() (line 136 / /app/docroot/modules/contrib/search_api/src/Utility/FieldsHelper.php).
Sorry, I don't have possibility to create an interdiff right now.
Patch attached. Ok so in the patch, that I attach, I got quite a few suspect things, so I will start with some disclaimers.
Disclaimer 1: Please do not use this patch in production, unless you are sure it will not cause you any of the problems I will mention below. I have on purpose hidden the patches from display.
Disclaimer 2: Some existing tests will fail, I have not taken the time to maintain them yet, since I would like some feedback first.
With those out of the way, here is what I did.
@qzmenko I based my patch on #44 and not on #47 because it is not clear to my when/why you would have a non existent entity if a reference to it exists.
I am sure there are valid cases, I just don't have any in mind atm, and I am not sure how we should handle those cases. In #47 you simply let the FieldHelper do as it would before, which means index the fields normally, which makes no sense to me if the referenced entities, for said field, do not exist.@mkalkbrenner With #44 which is basically #41, The described functionality worked fine for nested field
variations:entity:whatever:whatevercode
but not for direct fields e.g. as simple taxonomy fieldfield_brand
.
In my understanding, the first timeextractFields
is called, theComplexDataInterface $item
is the actual indexed entity. We don't care about that since it will be filtered out later, or should have already been filtered by the actual processor (Not sure atm at what point this happens, but does not rly matter). Then for that entity, the selected fields to index are split between nested and direct fields. For nested fields we have a recursive call, which then filters those fields based on status.
For direct fields we don't do that, insteadextractField
is called. I extracted your logic in a helper function and used it inextractField
as well... and here I had some thoughts.@drunken monkey I refrained from injecting the
$index
variable to theextractField
function, since it already hasFieldInterface $field
as an argument, and we should be able to extract the index from there. I had to pass the index to theextractFieldValues
function though, which does the actual work. Now here I got some issues.
This funciton is used by stuff I could not test atm in my code, and I think would have weird side effects.E.g. It is used in
AddHierarchy
processor. I don't know what will happen if you stop indexing something that has published children. How do we handle the children? It would make sense to stop displaying them if their ancestor is not displayed, but maybe some would argue that it's ok to keep showing them without the ancestor. I don't know how something like that is already handled in the processor, and do not atm have a working environment with matching data to test this, so I made the index variable nullable inextractFieldValues
and my helper function, which leads extractFieldValues calls to gracefully keep doing what they were doing until now.
Maybe I am overthinking it, maybe not, idk.Same for the SearchApiFieldTrait::898 in `extractPropertyValues`... Did not handle that, since I am not sure If we want to.
We could either decide for them here, or handle those cases in their separate follow ups, I am not so familiar with the entire codebase yet.Lastly I completely ignored existing tests for the
extractFieldValues
, until we decide if this is a right approach. Give me some feedback, and I could get to them after.TLDR:
- Stuff on #47 were not clear to me, we should address this point, but I left it our for now.
- Direct fields were not handled by the proposed logic only nested... Think I cover this in the attached patch now.
- Did not cover every call of theextractFieldValues
function, to avoid possible side-effects.- Merge request !139Resolve #3114792 "Exclude unpublished references" → (Open) created by drunken monkey
- 🇦🇹Austria drunken monkey Vienna, Austria
Thanks a lot for your work on this!
Since we switched this module to use issue forks a few months ago, I created a new MR with the last two patches, plus a little clean-up by myself.In general, I do like your approach, so please feel free to work further on it. Once we have tests, we can hopefully also verify that #44 really didn’t cover some cases which your patch does.
Some more remarks:
- While I like the separate
shouldExtractReference()
method, I don’t think it has to be on the interface. It can just beprotected
, in my opinion. (Also causes less trouble with backwards-compatibility.) - Very good point about
$field
already containing a reference to an index. I think we should also use that approach to avoid the explicit$index
parameter forextractFields()
, if possible. Now that I think about it, it’s really impossible to add it as anything but the last parameter, so we’d have needed to change that anyways. (Might cause issues with some of the callers, though – let’s see.)
Anyways, still in desperate need of test coverage.
- While I like the separate
- 🇦🇷Argentina drupal-son
I have this exact need.
I'm using JSONAPI on an index indexing nodes with field realation to paragraphs.
Some paragraphs could be unpublished and I would like to JSONAPI avoid representing these unpublished node on the result.FYI - As the consumer user don't have permissions for reading unpublished content, then JSONAPI only returns the entity type and the UUID without any other extra information, but at on the `meta` -> `omitted`section JSONAPI adds references of entites returned that don't have enough permissions on the `data` section.
I'm trying to skip search api to index the unpublished related content, then JSONAPI not returning unpublished entities and no mention about them on the `omitted` section.
Is the latest patch file tested or ready to be used on prod or merged into https://www.drupal.org/project/search_api/releases/8.x-1.x-dev → ?
- 🇦🇷Argentina drupal-son
@eduardo-morales-alberti @qzmenko @drunken-monkey @mkalkbrenner @marios-anagnostopoulos
I tested the patches #44 and #48 on the search_api 8.x-1.34 (not 1.35 due to https://www.drupal.org/project/search_api/issues/3454939 🐛 Declaration of Drupal\search_api_solr\Plugin\search_api\backend\SearchApiSolrBackend::__sleep() must be compatible Active ) running on Drupal 10.2.5 using Acquia Search API Solr Server.
I have a search api index with multiple entity reference fields (pointing to content types and taxonomies).
BTW, is Paragraph entity type supported?
I have the option "Track changes in referenced entities" enabled.
I still can't see the "Exclude field references entities unpublished from results." processor on my index.Does it matter what Search API server is used?
What I am missing? - 🇳🇿New Zealand jonathan_hunt
The patch in #48 works for me on Search API 8.x-1.35 but since my use case is to avoid indexing unpublished media entities referenced from a node, I needed to alter
shouldExtractReference()
to check forEntityAdapter
, not justEntityReferenceItemInterface
, e.g.:public function shouldExtractReference(TypedDataInterface $reference_item, ?IndexInterface $index = NULL): bool { if ($index?->isValidProcessor('entity_status')) { if ($reference_item instanceof EntityAdapter) { $referenced_entity = $reference_item->getEntity(); } elseif ($reference_item instanceof EntityReferenceItemInterface) { $referenced_entity = $reference_item->get('entity') ->getTarget() ->getValue(); } if ($referenced_entity instanceof EntityPublishedInterface) { return $referenced_entity->isPublished(); } elseif ($referenced_entity instanceof UserInterface) { return $referenced_entity->isActive(); } } return TRUE; }
- 🇦🇹Austria drunken monkey Vienna, Austria
@jonathan_hunt: Thanks for the feedback, did not know that this could be a direct
EntityAdapter
instance. But should be no harm in just checking for that, too – thanks.Still needs test coverage, please.