- ๐ท๐บRussia walkingdexter
@gbyte I agree with #16. Meta information is stored in the database, most likely the proposed change will create memory issues.
This hook working during generation but not during queue creation might be confusing for implementers ("Why do only some entities go through this hook?"). This could be circumvented by removing the following from EntityUrlGenerator::getDataSets thus delegating access checks completely to the generation process
Not sure about this change because I don't fully know the queue processing logic.
But yeah, this sounds cleaner indeed. I'm not a big fan of hooks either hence this module's extensive API. Just wondering how much code overhead it will be to implement reacting to that event for e.g. the rabbit_hole module as opposed to a hook implementation.
And I'm not a big fan of mixing hooks and events :) I think we should choose one of these approaches for the whole API.
...take a look at the code and state your opinion about all of it
Is it a good idea to remove the subscriber definition when the rabbit_hole module is not installed? Maybe it's better to add the definition when the module is installed?
- Assigned to walkingdexter
- ๐ฉ๐ชGermany gbyte Berlin
@gbyte I agree with #16. Meta information is stored in the database, most likely the proposed change will create memory issues.
I believe right now it is only stored temporarily in \Drupal\simple_sitemap\Queue\QueueWorker::stashResults() along with the rest of the link data when the generation process does not complete during given time. You raise a valid concern however - we should remove all data that is not relevant for generation before it is being passed to the queue worker. I'll make a note. Let me know if you think meta information is stored in the database anywhere else.
And I'm not a big fan of mixing hooks and events :) I think we should choose one of these approaches for the whole API.
This is a fair point and seeing the big code overhead events create, I am reluctant to switch to it. Which one do you prefer for this module?
- Issue was unassigned.
- ๐ท๐บRussia walkingdexter
I believe right now it is only stored temporarily in \Drupal\simple_sitemap\Queue\QueueWorker::stashResults() along with the rest of the link data when the generation process does not complete during given time... Let me know if you think meta information is stored in the database anywhere else.
My concerns relate specifically to this method.
\Drupal\simple_sitemap\Queue\QueueWorker::stashResults()
uses the state system to store information. Default implementation of\Drupal\Core\State\StateInterface
uses the key/value store. Default implementation of\Drupal\Core\KeyValueStore\KeyValueStoreInterface
uses the database to store key/value data.\Drupal\Core\State\State
- default state implementation.
\Drupal\Core\KeyValueStore\DatabaseStorage
- default key/value store implementation.One way or another, in some cases the data is stored in the database during the sitemap generation. Storing heavy data (entity objects) should be justified and at least properly tested. At this moment, I think it more reasonable to store the entity ID and load the object on demand. Note that the entity static cache prevents the same entities from being reloaded in the same request.
This is a fair point and seeing the big code overhead events create, I am reluctant to switch to it. Which one do you prefer for this module?
Usually I prefer to use events because they are more flexible. If we want to use events, then we should introduce them in the next major update, since we cannot simply remove hooks due to backwards compatibility.
- ๐ง๐ชBelgium dieterholvoet Brussels
How about we create a new event and dispatch it, + we create a new hook which is invoked in an event subscriber. We immediately deprecate the hook and event subscriber to be removed in the next major release. That way, in the next major release, all that needs to be done is to remove the event subscriber and hook documentation. That also gives us more time to transition the other hooks into events.
- ๐ฉ๐ชGermany gbyte Berlin
My concerns relate specifically to this method. \Drupal\simple_sitemap\Queue\QueueWorker::stashResults() uses the state system to store information. Default implementation of \Drupal\Core\State\StateInterface uses the key/value store. Default implementation of \Drupal\Core\KeyValueStore\KeyValueStoreInterface uses the database to store key/value data.
\Drupal\Core\State\State - default state implementation.
\Drupal\Core\KeyValueStore\DatabaseStorage - default key/value store implementation.One way or another, in some cases the data is stored in the database during the sitemap generation. Storing heavy data (entity objects) should be justified and at least properly tested. At this moment, I think it more reasonable to store the entity ID and load the object on demand. Note that the entity static cache prevents the same entities from being reloaded in the same request.
I don't intend to be storing objects in the database; you seem to have missed this:
You raise a valid concern however - we should remove all data that is not relevant for generation before it is being passed to the queue worker. I'll make a note. Let me know if you think meta information is stored in the database anywhere else.
There is a ticket attached to this issue to make sure nothing like this happens: ๐ Check what data is being saved to database upon generation Active .
How about we create a new event and dispatch it, + we create a new hook which is invoked in an event subscriber. We immediately deprecate the hook and event subscriber to be removed in the next major release. That way, in the next major release, all that needs to be done is to remove the event subscriber and hook documentation. That also gives us more time to transition the other hooks into events.
Feel free to create a ticket to tackle this in the future. ๐ Shift entity access checks from queue to generation Active and passing entities to hooks seems like the simple non-breaking solution in the meantime. I'll take a look at this soon.
- last update
over 1 year ago 32 pass - ๐ง๐ชBelgium dieterholvoet Brussels
I updated the MR to make it compatible with Rabbit Hole 2, dropping support for Rabbit Hole 1.
- last update
over 1 year ago 32 pass - Status changed to RTBC
about 1 year ago 9:39am 19 February 2024 - ๐ง๐ชBelgium stijnstroobants Leuven
Reviewed and confirmed the patch is working as needed!
- First commit to issue fork.
- ๐ฏ๐ดJordan mohammad-fayoumi Amman
Reroll the MR !16 by creating a static patch file to work with the latest stable version 4.2.1
- ๐ท๐บRussia walkingdexter
My current thoughts:
- In this issue we should focus on adding the ability to act on a processed entity. I propose to move the event subscriber (or hook implementation) to the Rabbit Hole module by creating a separate issue. Otherwise, I don't think it's right to drop support for Rabbit Hole 1. It's stable, and we'll have to support both versions.
- The discussion about events and hooks is relevant again because of this change record โ .
- The proposed code must use all possible type declarations.
- All warnings from the CI pipeline must be fixed.
- Status changed to Needs work
about 1 month ago 8:50am 4 March 2025 - @walkingdexter opened merge request.
-
walkingdexter โ
committed 8092072e on 4.x
Issue #3087347 by dieterholvoet, walkingdexter, mohammad-fayoumi, rory...
-
walkingdexter โ
committed 8092072e on 4.x
- ๐ท๐บRussia walkingdexter
- Added
hook_simple_sitemap_entity_process()
to allow other modules to process the entity. - Replaced event with hook to provide consistency and reduce support overhead. Replacing hooks with events should be discussed in a separate issue.
- Support for the Rabbit Hole module must be implemented outside of this here module. It's difficult to create a general implementation that will satisfy all use cases. For example, the solutions presented here don't take into account user permissions that affect the behavior of the Rabbit Hole module.
- Added
- ๐ฉ๐ชGermany gbyte Berlin
@walkingdexter Thanks!
Questions:
All our alter hooks are called ..._alter(). Why did you decide to use invokeAll() instead of alter() and call the hook hook_simple_sitemap_entity_process()? We also usually pass the current sitemap for context as secondary parameter.
Also I'm flirting with the idea to allow for users to set the entity to NULL inside the hook to skip it instead of them throwing our exception. Or allow both.
Finally, in simple_sitemap.api.php there is an example of changing the entity's field, however we don't pass the entity to constructPathData.
Also note to myself to add this to readme.md.
- ๐ท๐บRussia walkingdexter
All our alter hooks are called ..._alter(). Why did you decide to use invokeAll() instead of alter() and call the hook hook_simple_sitemap_entity_process()?
Similar to entity hooks (load, update, insert, etc.). There is nothing to "alter" if a developer just wants to throw
SkipElementException
.We also usually pass the current sitemap for context as secondary parameter.
Yeah, missed that. Definitely needs to be added.
Also I'm flirting with the idea to allow for users to set the entity to NULL inside the hook to skip it instead of them throwing our exception. Or allow both.
I don't see much benefit from this change. It can be confusing as there will be two ways to do the same thing. I think it's better to use the exception.
Finally, in simple_sitemap.api.php there is an example of changing the entity's field, however we don't pass the entity to constructPathData. Does this still work?
It works :) Most likely, this is due to the static entity cache.
- @walkingdexter opened merge request.
- ๐ท๐บRussia walkingdexter
Added the current sitemap for context, removed the example of changing the entity's field as it is not obvious.