- Issue created by @RoSk0
- Status changed to Needs review
almost 2 years ago 4:14am 16 May 2023 - 🇳🇿New Zealand RoSk0 Wellington
Here is a patch which fixes and issue. Together with test. Test passes locally , but it can't be run in CI as all the other tests in the module are outdated and broken.
- Status changed to RTBC
almost 2 years ago 8:58pm 23 May 2023 - 🇳🇿New Zealand ericgsmith
Patch looks good - fixes the error with duplicate IDs.
Test is good too - as noted by you it could be a unit test, I think given currently the tests are not running that this shouldn't prevent it being RTBC and a follow up issue can be created to get the tests running.
+++ b/tests/src/Kernel/TocBuilderTest.php @@ -0,0 +1,68 @@ + * @todo Convert to unit test and include in TocTest, once that is fixed.
- Status changed to Needs work
almost 2 years ago 9:13pm 23 May 2023 - 🇳🇿New Zealand ericgsmith
Sorry, was a bit premature on setting RTBC.
On closer look the test and code assumes that the ID precedes the generated ID, e.g
<h3>Documents</h3> <p><a id="documents"></a></p>
Will still produce duplicate IDs.
- Status changed to Needs review
almost 2 years ago 12:46am 25 May 2023 - 🇳🇿New Zealand RoSk0 Wellington
Thanks for the review Eric and picking that issue up!
Also spotted this.
When ToC API faces this type of HTML:
<p><a name="documents"></a></p> <h2>Documents</h2>
with now deprecated,
name
attribute.It produces this:
<h2 id="documents">Documents<a href="#documents" class="anchor">#</a></h2> <h2 id="documents">Documents<a href="#documents" class="anchor">#</a></h2>
and throws this warning:
Warning: Uninitialized string offset 1 in Drupal\toc_api\TocBuilder->renderContent() (line 75 of /app/web/modules/contrib/toc_api/src/TocBuilder.php)
So I added a workaround and test for that as well.
- Status changed to RTBC
almost 2 years ago 1:52am 25 May 2023 - 🇳🇿New Zealand ericgsmith
Nice work - looks good to me.
Ran test locally on D10 / PHP 8.1 and all green.
- 🇳🇿New Zealand RoSk0 Wellington
Nice , thanks!
I was testing on D9.5, PHP 8.0 so we have a good test coverage!
- Status changed to Needs review
over 1 year ago 11:14pm 26 June 2023 - 🇳🇿New Zealand RoSk0 Wellington
We've identified a huge performance hit with introduced ID deduplication code on larger HTML documents. For the context, the documents we were testing this are around 5 Mb of plain HTML and are very complex , with lots of tables , headings and ID attributes.
Rendering the page in development environment was taking around 15 min to render and failing at times.
With this new approach based on XPath filtering , rather than whole DOM processing, it is now way more performance and we saw results of under 60 seconds to render on a cold cache.
Based on the XPath spec:
The resulting node sequence is returned in document order.
So we shouldn't loose heading order here. Tests to prove this would be nice to have surely, I don't have capacity for that at the moment.
- Status changed to RTBC
over 1 year ago 12:07am 27 June 2023 - 🇳🇿New Zealand ericgsmith
Awesome work! Significant improvement when dealing with large content.
- 🇳🇿New Zealand RoSk0 Wellington
We have this patch running in production since 2023-06-27 with no negative effects found to date.
-
VladimirAus →
committed ebd63a54 on 8.x-1.x authored by
RoSk0 →
Issue #3360540 by RoSk0, ericgsmith: Duplicated HTML IDs
-
VladimirAus →
committed ebd63a54 on 8.x-1.x authored by
RoSk0 →
- Status changed to Fixed
over 1 year ago 12:28pm 24 November 2023 - 🇦🇺Australia VladimirAus Brisbane, Australia
Thank you! Commited and released! 🍻
Automatically closed - issue fixed for 2 weeks with no activity.