I disagree with closing.
The problem is still there - loading all items at once.
It should be split into chunks or batches.
@dmitry.korhov Ok. Please provide a detailed explanation of how to see the benefit of the proposed changes. Also keep in mind that with these changes half of the links will not be included in the sitemap.
If you look closely at the current implementation of yieldItem(), you will see that the items are not loaded all at once, because we use fetchObject(). In the case of MySQL, what matters is whether queries are buffered. However, in my testing I found no correlation with memory consumption (Drupal 10.4, PHP 8.1, MySQL 8.0). The proposed changes also have no effect.
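For readers unfamiliar with the pattern, here is a minimal sketch of yielding rows one at a time from a statement. It is not the module's actual yieldItem() code: FakeStatement and yield_items() are hypothetical names, and the fake statement keeps its rows in an array only so the example runs without a database.

```php
<?php

/**
 * Hypothetical stand-in for a database statement. It returns one row per
 * call and FALSE when exhausted, mimicking how fetchObject() is consumed.
 * A real statement would read from the driver; this one reads from an array.
 */
final class FakeStatement {
  private int $position = 0;

  public function __construct(private array $rows) {}

  public function fetchObject(): object|false {
    return $this->rows[$this->position++] ?? false;
  }
}

/**
 * Yields rows one by one instead of collecting them into an array first.
 */
function yield_items(FakeStatement $statement): \Generator {
  while ($row = $statement->fetchObject()) {
    yield $row;
  }
}

$statement = new FakeStatement([
  (object) ['id' => 1],
  (object) ['id' => 2],
  (object) ['id' => 3],
]);

$ids = [];
foreach (yield_items($statement) as $item) {
  // Only the current row is held by the loop at this point.
  $ids[] = $item->id;
}
```

With MySQL, a generator like this limits what PHP code holds on to, but a buffered query still transfers the whole result set to the client, which is why buffering is the relevant setting there.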
It's possible that the improvement has already been implemented at the core level. See 🐛 Implement statement classes using \Iterator to fix memory usage regression and prevent rewinding Fixed for details.
This is not an issue anymore (tested locally with performance_test.php). Yeah, the memory usage grows with the amount of data in the sitemap, but I couldn't find any correlation with yieldItem(), even if we select one element at a time from the queue. It must be related to something else.
@walkingdexter Would you mind documenting this ability in readme.md?
Done.
@gbyte The changes look good to me, please review.
Added the current sitemap for context, removed the example of changing the entity's field as it is not obvious.
Closing due to lack of activity. Feel free to reopen if you can provide more information.
All our alter hooks are called ..._alter(). Why did you decide to use invokeAll() instead of alter() and call the hook hook_simple_sitemap_entity_process()?
Similar to entity hooks (load, update, insert, etc.). There is nothing to "alter" if a developer just wants to throw SkipElementException.
We also usually pass the current sitemap for context as a second parameter.
Yeah, missed that. Definitely needs to be added.
Also I'm flirting with the idea of allowing users to set the entity to NULL inside the hook to skip it, instead of throwing our exception. Or allow both.
I don't see much benefit from this change. It can be confusing as there will be two ways to do the same thing. I think it's better to use the exception.
Finally, in simple_sitemap.api.php there is an example of changing the entity's field, however we don't pass the entity to constructPathData. Does this still work?
It works :) Most likely, this is due to the static entity cache.
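A rough sketch of how a module might use the hook discussed above to skip an entity. The parameter list (entity first, current sitemap second) is an assumption based on this discussion, the exception's namespace is assumed, and mymodule and field_hide_from_sitemap are hypothetical names; check simple_sitemap.api.php for the authoritative signature.

```php
<?php

use Drupal\Core\Entity\EntityInterface;
use Drupal\Core\Entity\FieldableEntityInterface;
use Drupal\simple_sitemap\Exception\SkipElementException;

/**
 * Implements hook_simple_sitemap_entity_process().
 *
 * Assumed signature: the processed entity, then the current sitemap.
 */
function mymodule_simple_sitemap_entity_process(EntityInterface $entity, $sitemap): void {
  // Hypothetical boolean field that editors tick to hide an entity.
  $field = 'field_hide_from_sitemap';

  if ($entity instanceof FieldableEntityInterface
    && $entity->hasField($field)
    && $entity->get($field)->value) {
    // Throwing the exception is the single supported way to skip.
    throw new SkipElementException();
  }
}
```

Nothing is returned or altered here, which is the reason given for invoking the hook with invokeAll() rather than alter().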
@baikho thanks for the issue!
I like the idea, but not the implementation. This should be implemented similar to the Metatag module (per-entity overrides).
Looks like this is already in 4.x. Otherwise, a description must be added.
Created an issue in the Monitoring module and adapted the proposed code. See ✨ New sensor: Simple XML Sitemap Active for further work.
Please also save credits for those who worked on the original issue if the proposed changes are accepted.
This change should be mentioned in the release notes. The previous behavior can be restored using hook_entity_query_tag__simple_sitemap_alter() or hook_entity_query_tag__ENTITY_TYPE__simple_sitemap_alter().
Saved credits for @dieterholvoet as he participated in the discussion of this change in ✨ Discard adding if rabbit hole setting is active Needs review.
- Added hook_simple_sitemap_entity_process() to allow other modules to process the entity.
- Replaced event with hook to provide consistency and reduce support overhead. Replacing hooks with events should be discussed in a separate issue.
- Support for the Rabbit Hole module must be implemented outside of this module. It's difficult to create a general implementation that will satisfy all use cases. For example, the solutions presented here don't take into account user permissions that affect the behavior of the Rabbit Hole module.
Thanks for the issue! The proposed resolution is a breaking change, so I just fixed the info files.
Merged, thanks!
@sokru I can't reproduce the specified error with Drupal 10.4.1 and Drush 12.5.3. Please provide more information about your environment.
We turned off path processing in order to remove language prefixed paths. Instead of having two half-solutions, we should find a way to disable these language prefixes without breaking aliases.
The original problem is already fixed in Drupal 10.3 🐛 Language negotiation breaks updating Drupal 9 to 10 Needs work. So the change from 🐛 Unexpected language prefixes on sitemap index Fixed can be reverted, but the core version requirement should be increased to 10.3. I think we don't have to worry about it because 10.2 is no longer supported. Just tested all the changes and everything works fine, including path aliases.
Please provide more information about the problem, steps to reproduce and proposed resolution. The current summary doesn't provide enough details to understand the situation.
Already implemented in 4.x.
I don't use commerce - what would the canonical link of a product variation be and why does it not have a canonical link template?
Product variation URLs depend on the parent product:
http://example.com/product/{commerce_product}?v={commerce_product_variation}
https://www.drupal.org/project/commerce/issues/2674888
https://www.drupal.org/project/commerce/issues/3025860
Do you mean to allow all content entities and then try-catch around the $entity->toUrl(); block if no URL is returned? Feel free to experiment with this.
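The try-catch idea raised in that question could look roughly like the sketch below. It is an illustration only, not code from the module: mymodule_entity_url_or_null() is a hypothetical helper, and it handles only the case where the entity type defines no link template for the requested relation.

```php
<?php

use Drupal\Core\Entity\EntityInterface;
use Drupal\Core\Entity\Exception\UndefinedLinkTemplateException;
use Drupal\Core\Url;

/**
 * Returns the entity URL, or NULL when no link template is defined.
 *
 * Hypothetical helper name; toUrl() itself is part of EntityInterface.
 */
function mymodule_entity_url_or_null(EntityInterface $entity): ?Url {
  try {
    // Defaults to the 'canonical' relation.
    return $entity->toUrl();
  }
  catch (UndefinedLinkTemplateException $e) {
    // The entity type has no canonical link template: treat as "no URL".
    return NULL;
  }
}
```

Entity types that override toUrl(), as product variations appear to do by pointing at the parent product, would return a URL here even without a canonical link template.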
I thought about it, but now I think it's better to use a simple solution. In a quick search, I couldn't find any other examples similar to product variations. So for now we have one entity type that requires special treatment.
@matthewv789 I can't reproduce this problem on a fresh install. Try composer why-not drupal/simple_sitemap 4.2.2 or composer require "drupal/simple_sitemap:4.2.2" --dry-run to get more details. Most likely the problem is related to your environment or Composer configuration.
@aarantes can you reproduce the problem on a clean install? For now it looks more like a problem with your environment.
If someone needs a patched 1.6.0.
The problem described here causes an error in Drupal 11. I think that's enough to bump the priority.
TypeError: Drupal\Component\Utility\Html::escape(): Argument #1 ($text) must be of type string, null given, called in /var/www/core/lib/Drupal/Component/Render/FormattableMarkup.php on line 238 in Drupal\Component\Utility\Html::escape() (line 431 of /var/www/core/lib/Drupal/Component/Utility/Html.php).
Rerolling the patch for 11.1.
The previous patch doesn't work with Composer due to version string. This one should work.
Rerolling the patch for 2.0.0
@iseeaflyingcrane Thanks for the issue! This is a valid point.
The reason for this behavior is that product variation does not have a canonical link template. However, if an entity doesn't have a canonical template, it can still have a canonical link. EntityHelper::supports() excludes entities that can be successfully added to the sitemap.
@gbyte Maybe we should support all content entity types? We can move entity types without a canonical link template to a separate table on the /admin/config/search/simplesitemap/entities page and warn users about possible errors.
@gbyte Yes, it makes sense.
The desired result can be achieved by using hook_entity_query_tag__TAG_alter or hook_entity_query_tag__ENTITY_TYPE__TAG_alter, where TAG is simple_sitemap. For example, for nodes the hook will be hook_entity_query_tag__node__simple_sitemap_alter. In this hook you can add a date condition to the query.
The above feature was added in ✨ Allow to alter entity url generator query Needs review and will be available in the next release.
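As an illustration of the date condition mentioned above, a node-specific implementation might look like this. The module name mymodule, the choice of the changed field, and the 365-day window are assumptions made for the example.

```php
<?php

use Drupal\Core\Entity\Query\QueryInterface;

/**
 * Implements hook_entity_query_tag__ENTITY_TYPE__TAG_alter() for nodes.
 */
function mymodule_entity_query_tag__node__simple_sitemap_alter(QueryInterface $query): void {
  // Assumption: only nodes changed within the last 365 days are wanted.
  $threshold = time() - 365 * 86400;

  // 'changed' is the node's last-modified timestamp field.
  $query->condition('changed', $threshold, '>=');
}
```

Because the hook name includes the entity type, the condition applies only to node queries tagged simple_sitemap and leaves other entity types untouched.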
See related issues for details.
Already fixed in ✨ Add Senzam search engine to IndexNow functionality Fixed. Marked this also as fixed to save credits.
Thanks for the clarification. This seems to be a very specific use case. If I understand correctly, the problem can be solved with a few lines of code using hook_simple_sitemap_links_alter(). Is this true?
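For context, a "few lines of code" solution with that hook generally has the shape sketched below. The use case in this thread is not spelled out, so the filter is a placeholder: the module name mymodule, the path node/1, and the assumption that each link exposes its internal path under ['meta']['path'] should all be checked against simple_sitemap.api.php.

```php
<?php

/**
 * Implements hook_simple_sitemap_links_alter().
 *
 * The second parameter is the current sitemap; it is left untyped here
 * because its type differs between module versions.
 */
function mymodule_simple_sitemap_links_alter(array &$links, $sitemap): void {
  foreach ($links as $key => $link) {
    // Assumed link structure: internal path stored under meta.path.
    if (($link['meta']['path'] ?? NULL) === 'node/1') {
      unset($links[$key]);
    }
  }
}
```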
I can't reproduce the problem by following these steps with Menu Item Extras 3.1.0. Please provide more information.
I believe this issue 🐛 option argument 'bundle' sometimes has a NULL value, which can cause issue Fixed covers a similar state, and they proposed to solve it by falling back to the entity type ID.
In this case the problem is more obvious, as the documentation says that FieldDefinitionInterface::getTargetBundle() can return NULL.
@WalkingDexter @dpi please check out the gitlab review. Let's not merge this before discussing if this should instead go into a 3rd party simple_sitemap_monitoring module which I feel should be the way to go ATM.
I think this functionality can be added directly to the Monitoring module. It already contains plugins for other contrib modules.
Merged, thanks!
This is not a breaking change because PHP is case insensitive for the class names. However, in some environments, errors may occur due to file name changes. See 💬 Class 'Drupal\simple_sitemap\Queue\SimpleSitemapQueue' not found Fixed for additional information. This should be described in the release notes.
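The case-insensitivity point can be checked with plain PHP. In this small sketch the class is declared inline, so no autoloader is involved; the class name is borrowed from the linked issue purely for illustration.

```php
<?php

final class SimpleSitemapQueue {}

// Class names are resolved case-insensitively once the class is declared.
$queue = new simplesitemapqueue();

$is_instance = $queue instanceof SimpleSitemapQueue;
$exists_upper = class_exists('SIMPLESITEMAPQUEUE');

// The declared spelling is what PHP reports back.
$reported_name = get_class($queue);
```

The file-name caveat comes from autoloading: a PSR-4 autoloader derives the file path from the class name as written, so on a case-sensitive filesystem a differently cased name can fail to load even though PHP itself would accept it.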
Need more information about the use case. Why can't this be solved with the "Excluded languages" setting?
The proposed solution should not replace the current routing. It should be an option.
I don't see any practical use for this feature. This array is not intended to be modified. Feel free to reopen if you can provide a use case.
Need more information about the use case and the steps to reproduce.
The array of sitemap links should not be modified inside the addChunk() method.
8.x-3.x is outdated, it's unproductive to spend time on it. MR needs to be reworked for 4.x.
The necessary hooks are already in the core. All we need to do is add tag and metadata to the query.
Fixed, thanks!
I can't reproduce the problem on a clean install. Please provide steps to reproduce.
Committed, thanks!
ProcessOutbound Case:
Create a view with the URL of foo/bar/sitemap.xml. Try to go to this URL and you will be redirected to /bar/sitemap.xml.
FYI, that's not how it works. Outbound processors cannot be the cause of a redirect.
Need steps to reproduce. The queue is cleared before rebuilding.
Feel free to reopen if you have a problem with multisite installations.
The current solution is correct. The problem must be fixed by a third party: #3228568: LanguageNegotiationCountryPathUrl plugin not returned on Language Negotiator getNegotiationMethods() call.
Not related to 4.x, as sitemap types are not plugins.
The sitemap protocol still contains <priority> and <changefreq> values.
Closing due to lack of activity.
This feature conflicts with ✨ Override a non-indexed bundle on a per-entity basis Needs work and therefore cannot be implemented.
Fixed errors and merged, thanks! We can't use autoconfigure for now because it would break backwards compatibility.
Merged, thanks!
Just realized that this is a duplicate of ✨ change sitemap.xml?page=1 to sitemap1.xml Needs review.
My current thoughts:
- In this issue we should focus on adding the ability to act on a processed entity. I propose to move the event subscriber (or hook implementation) to the Rabbit Hole module by creating a separate issue. Otherwise, I don't think it's right to drop support for Rabbit Hole 1. It's stable, and we'll have to support both versions.
- The discussion about events and hooks is relevant again because of this change record.
- The proposed code must use all possible type declarations.
- All warnings from the CI pipeline must be fixed.
When a developer decides to use DATE_ISO8601, they will see that DATE_ISO8601 is deprecated and DATE_ATOM should be used instead. However, the <lastmod> date can be formatted in different ways and ATOM format is not the only option.
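To make the point about formats concrete, the sketch below produces two values for the same timestamp, both of which are valid W3C Datetime forms for <lastmod>. The helper names are hypothetical; only gmdate() and DATE_ATOM are PHP built-ins.

```php
<?php

/**
 * Full date and time with UTC offset (ATOM format).
 */
function lastmod_full(int $timestamp): string {
  return gmdate(DATE_ATOM, $timestamp);
}

/**
 * Date only, which the sitemap protocol also accepts.
 */
function lastmod_date_only(int $timestamp): string {
  return gmdate('Y-m-d', $timestamp);
}

echo lastmod_full(0), "\n";      // 1970-01-01T00:00:00+00:00
echo lastmod_date_only(0), "\n"; // 1970-01-01
```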
@introfini Sorry for the late response. All you need to do is add a custom submit callback to the node form and change the $form_state values related to the sitemap settings before they are saved.
There is no need to divide sitemaps by language. You can simply divide them by content type. URLs for other languages will be automatically included.
The sitemap page itself is not supposed to be indexed. See related issues for details.
This looks like a feature request.
@damienmo The following may help you:
- Set the "Maximum links in a sitemap" limit or reduce it if it's already set.
- Turn off the "Exclude duplicate links" feature. With large numbers of links, this feature will lead to heavy SQL queries and memory issues.
If that doesn't help, try to find out what the largest data is.
Merged, thanks!
@handkerchief In your scenario I still get the 404 error. Can you provide more details on step 3?
- Is the Content Translation module disabled?
- What settings are set on /admin/config/regional/content-language?
- What settings are set on /admin/config/regional/language/detection?
I tested the following scenario on a clean install with 4.x-dev:
- Enable two languages - English and non-English.
- Set English as the default language.
- Use the "Selected language" plugin for language detection and configure it with a non-English language.
- Enable content translation for the "Article" content type.
- Add a new node of type "Article" and translate it into a non-English language.
Now when I set a URL alias for the created node, I get a 404 error if the alias has a non-English language. If I change the alias language to English or unspecified, then the 404 error does not occur. The sitemap is also correct in this case.
So I can't reproduce the problem on a clean install. Maybe I'm missing something. Feel free to reopen if you can provide more information to reproduce the problem. In other cases see #4 and #5 for possible solutions. Also consider changing the URL alias language.
Merged, thanks!
I can't reproduce the problem from #4 - different sitemap results for different languages. This has probably been fixed in other issues. Speaking about the original problem, the proposed resolution is incorrect. If I understood the example correctly, then https://www.drupal8.loc/de/OeffentlicheSeite and https://www.drupal8.loc/en/node/401 are the URLs of the same node. In this case, the specified result is expected. This is exactly how the "Skip non-existent translations" functionality is designed.
Just tested, all bugs reported here have been fixed in 🐛 Unexpected language prefixes on sitemap index Fixed