- Issue created by @jonathan_hunt
- 🇳🇿 New Zealand jonathan_hunt
I've dug further into the site where this is occurring, and the underlying challenge is OCRed text being rendered into a field modelled as `basic_html`. When html_tag_usage analyses the site, it picks up this content and processes it, but in the subsequent report it tries to generate routes for HTML tags that are not valid tags, e.g.
<..:><code><..gt:>
causing errors in the route generation. The root cause is mine to resolve, but this module could check that the "tag" encountered consists only of valid characters (a..z, A..Z, etc.) before attempting to build a route.
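A minimal sketch of the check suggested above, written in Python purely to illustrate the logic (the module itself is PHP, and this is not its code). The pattern and the names `TAG_NAME`, `is_valid_tag_name`, and `routable_tags` are assumptions made for this example: a tag name is treated as valid when it starts with an ASCII letter and continues with letters, digits, or hyphens.

```python
import re

# Assumed pattern (not taken from the module): an ASCII letter followed by
# any number of ASCII letters, digits, or hyphens.
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def is_valid_tag_name(tag: str) -> bool:
    """Return True when the whole string looks like a plausible tag name."""
    return TAG_NAME.fullmatch(tag) is not None


def routable_tags(tags):
    """Keep only the tags that are safe to build a report route for."""
    return [tag for tag in tags if is_valid_tag_name(tag)]
```

With a guard like this, OCR debris such as `..:` would be skipped before any route is generated, while ordinary tags such as `code` pass through unchanged.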
- 🇺🇸 United States mortona2k Seattle
What do we want to do with bad html like this? Seems like it should have another report for it.
In the code that renders the table, we can check the tags/attributes for invalid options and show a message. But I found it more useful to change the route regex to allow the invalid html so we can see it in the report.
There is a note in the routing file that the attribute regex is not broad enough, but this issue is about the tag regex. I just added a `:` to both.
We can probably use the existing regex to check for valid HTML and then put the invalid tags in a separate group. We probably don't want them in the list at the bottom for copying into text filters, or we should at least call them out somehow for the dev to consider.
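The grouping idea could look roughly like the following sketch. It is in Python for illustration only and is not the module's PHP code. The pattern `VALID_TAG`, the helpers `partition_tags` and `allowed_html_string`, and the tag-to-count dictionary shape are all hypothetical stand-ins for whatever the report actually uses.

```python
import re

# Hypothetical stand-in for the module's existing tag regex.
VALID_TAG = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def partition_tags(tag_counts):
    """Split a {tag: usage_count} mapping into (valid, invalid) mappings."""
    valid, invalid = {}, {}
    for tag, count in tag_counts.items():
        target = valid if VALID_TAG.fullmatch(tag) else invalid
        target[tag] = count
    return valid, invalid


def allowed_html_string(valid):
    """Build the copyable tag list from the valid tags only."""
    return " ".join(f"<{tag}>" for tag in sorted(valid))


valid, invalid = partition_tags({"p": 10, "..:": 2, "code": 3})
# Valid tags feed the copyable list; invalid ones can be reported separately.
summary = allowed_html_string(valid)
```

Keeping the invalid entries in their own mapping lets the report still display them, flagged for the developer, without them leaking into the string meant for text filter configuration.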
- 🇺🇸 United States mortona2k Seattle
mortona2k → changed the visibility of the branch 1.0.x to hidden.
- Status changed to Needs review
about 1 year ago 11:43pm 5 April 2024 - 🇺🇸 United States mortona2k Seattle
The patch fixes the specific error and lets the report load. I created a new issue for a better invalid html report.
- 🇺🇸 United States mortona2k Seattle
mortona2k → changed the visibility of the branch 1.0.x to active.
- 🇺🇸 United States mortona2k Seattle
mortona2k → changed the visibility of the branch 1.0.x to hidden.
- Status changed to RTBC
7 months ago 3:27am 17 September 2024 - 🇳🇿 New Zealand jonathan_hunt
The patch works for me: it allows the HTML tag usage report to load in the presence of invalid HTML.