- 🇺🇸United States nmillin
@itmaybejj have you been able to make some progress on this roadmap?
In looking around, it seems like a custom test could do link checking - https://editoria11y.princeton.edu/configuration/#customtests - but that would be a different way of checking links vs leveraging the linkchecker module.
Thanks!
- 🇺🇸United States itmaybejj
Much progress! 1-3 are done, and yes some users have already used them to write connectors for the library in other CMSs.
I definitely will not be writing my own link checker. It's just too far outside my expertise, and my employer already pays for a vendor tool.
The next thing I'm going to work on is rewriting my database tables to have more robust node references. If I get that working it should make connecting with other modules via the dashboard initiative much easier.
It will probably take me a year to write all that myself, but the pieces are in place for someone to contribute connectors with specific modules quite easily before then if they have the time; I'd be happy to show them how to do it, and I have some code samples users have sent to me over the years.
- 🇺🇸United States nmillin
Thanks, @itmaybejj. I would be interested in looking at the code samples you have. I'll DM you my email address if that makes it easier to share them.
We have SiteImprove for link checking, but I'm researching other options to make the content authoring experience better and have Drupal reports.
- 🇺🇸United States itmaybejj
Longer answer sent privately, but for the record:
For Drupal, I would want to filter it to relevant things for the current page, like I do for the dismissals on the page.
That array is just slapped into drupalSettings at line 343.
For broken links, the first thing I would try is building a selector – performance testing would be needed, of course. E.g., 'a[href="example1"], a[href="example2"]'. And then pass that to the native finder function:
Ed11y.findElements('error404','a[href="example1"], a[href="example2"]');
Ed11y.findElements('error403','a[href="example3"], a[href="example4"]');
The rest would follow the pattern from the safeLinks test -- add each of the found items to the results array, and provide the various strings needed to build each type of tip.
Honestly – the Editoria11y side of this is really easy for me. The hard part for me would be rooting around in LinkChecker’s tables to shove the list of broken URLs on a route into drupalSettings somewhere. If someone took that on, I could do the rest.
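For anyone following along, the selector-building step could be sketched roughly like this. The helper names `escapeAttrValue` and `buildLinkSelector` are hypothetical and not part of Editoria11y; the only library call referenced is the `Ed11y.findElements` call quoted above, and it is left commented out because it only exists in the browser. This also assumes the stored URL matches the `href` attribute exactly as written in the markup, which may not hold for relative links.

```javascript
// Hypothetical helper: escape a URL for use inside a double-quoted
// CSS attribute value (only backslashes and double quotes are handled).
function escapeAttrValue(value) {
  return String(value).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}

// Hypothetical helper: turn a list of URLs into one selector string,
// e.g. 'a[href="example1"], a[href="example2"]'. Duplicates are dropped.
function buildLinkSelector(urls) {
  const unique = [...new Set(urls)];
  return unique.map((url) => `a[href="${escapeAttrValue(url)}"]`).join(', ');
}

const selector404 = buildLinkSelector(['example1', 'example2', 'example1']);

// In the browser, inside a custom test, the selector would then be handed
// to the finder function quoted above:
// Ed11y.findElements('error404', selector404);
```

An empty URL list produces an empty string, so the caller would want to skip the finder call in that case rather than pass an empty selector.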
- 🇺🇸United States itmaybejj
Oh I'll also say since you mentioned SI -- I am also looking into doing this using the API from our third-party solution. E.g., piping the broken links it found into the Editoria11y interface, so editors can view all their issues from any source via Editoria11y's UI.
Ideally I would love to just have a bucket of connectors in the module -- LinkChecker, SiteImprove, DubBot, etc. There would need to be a lot of filtering though -- a lot of issues from the "bigger" vendors are irrelevant to content editors. So I'd only want to pipe in a few relevant tests.
- 🇺🇸United States timwood Rockville, Maryland
Can the JS follow a link as/from the browser in the background and then report back rather than trying to integrate with Linkchecker? That way all user client authentication/cookies/sessions could be leveraged when checking the links.
- 🇺🇸United States itmaybejj
It can be done for sure -- just make AJAX calls and report errors...but it would be less effective and more work in my opinion -- links often go stale on pages you are not viewing, so link checking is usually done by crawling rather than live checking. And then to keep the load down on the server and the browser, I'd want to create a table of every found link and when it was last checked...
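A rough sketch of the "table of every found link and when it was last checked" idea, only to illustrate the bookkeeping. The record shape (`url`, `lastChecked`), the function name `linksDueForCheck`, and the age threshold are all assumptions for illustration; nothing like this exists in the module today.

```javascript
// Hypothetical table: one row per found link, with the time it was last
// checked in milliseconds since the epoch, or null if never checked.
const linkTable = [
  { url: 'https://example.com/a', lastChecked: null },
  { url: 'https://example.com/b', lastChecked: 1000 },
  { url: 'https://example.com/c', lastChecked: 9000 },
];

// Return the URLs that have never been checked, or whose last check is at
// least maxAgeMs old, so only those are re-requested.
function linksDueForCheck(table, now, maxAgeMs) {
  return table
    .filter((row) => row.lastChecked === null || now - row.lastChecked >= maxAgeMs)
    .map((row) => row.url);
}

const due = linksDueForCheck(linkTable, 10000, 5000);
```

With the sample rows, only the never-checked link and the one checked 9000 ms ago are due; the link checked 1000 ms ago is skipped.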
- 🇺🇸United States itmaybejj
Although if I took that route -- I have often thought about creating a browser-based simple crawler for Editoria11y, that just creates a grid of iframes and runs through a sitemap, with a shared Worker thread in each iframe phoning home to tell the controller when it has finished loading so the next URL can be requested. That would allow for any number of browser based checkers (e.g. AXE core) to run without hitting the server. But oh goodness the load it would create, as every request would be uncached and include JSON hits with the result.
- 🇺🇸United States timwood Rockville, Maryland
Could the link checking only happen on link insertion or save/preview submit or something? It might be external links too.
- 🇺🇸United States itmaybejj
Sure, if you're thinking more along the lines of input validation than maintenance down the line.
- 🇺🇸United States nmillin
@itmaybejj re: "Honestly – the Editoria11y side of this is quick and easy for me. The hard part for me would be rooting around in LinkChecker’s tables to shove the list of broken URLs on a route into drupalSettings somewhere. If someone took that on, I could do the rest."
I'm going down this rabbit hole and will post back here in the next week or two.
Thanks for the insights!
- 🇺🇸United States nmillin
@itmaybejj so I have something for you to look at. I was able to get the data to drupalSettings. Is this something you could work with?
/**
 * Implements hook_preprocess_HOOK().
 */
function HOOK_preprocess_node(&$variables) {
  // Exit if user does not have "view" permission.
  if (!Drupal::currentUser()->hasPermission('view editoria11y checker')) {
    return;
  }

  // @todo this should probably be tweaked depending on what drupal hook
  // is used.
  if (isset($variables['node']) && $variables['view_mode'] == 'full') {
    // Get the node ID.
    $nid = $variables['node']->id();

    // Found that linkcheckerlink is an entity that we can query.
    $links_entity_manager = \Drupal::entityTypeManager()->getStorage('linkcheckerlink');
    $entity_ids_for_node = $links_entity_manager->getQuery()
      ->accessCheck(FALSE)
      ->condition('parent_entity_type_id', 'node')
      ->condition('parent_entity_id', $nid)
      ->condition('code', NULL, 'IS NOT NULL')
      ->execute();

    /** @var \Drupal\linkchecker\Entity\LinkCheckerLink[] $links_for_node */
    $links_for_node = $links_entity_manager->loadMultiple($entity_ids_for_node);

    $linkchecker_data = [];
    foreach ($links_for_node as $linkcheckerlink) {
      $linkchecker_data[] = [
        'url' => $linkcheckerlink->get('url')->getString(),
        'code' => $linkcheckerlink->get('code')->getString(),
      ];
    }

    // Pass the data on to JavaScript to handle the rest.
    $variables['#attached']['drupalSettings']['editoria11y']['linkchecker'] = [$linkchecker_data];
  }
}
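On the JavaScript side, the data from the hook above might be consumed along these lines. This is a sketch under assumptions: the function name `groupBrokenLinks` and the `error<code>` key format are hypothetical (the key format just mirrors the 'error404' / 'error403' examples earlier in the thread), and treating every code of 400 or higher as broken is a guess. Note that the PHP wraps the list in one extra array and that `getString()` returns codes as strings, so the sketch flattens one level and parses the code.

```javascript
// Sample data in the shape the PHP above would produce in drupalSettings.
const sampleSettings = {
  editoria11y: {
    linkchecker: [[
      { url: 'https://example.com/gone', code: '404' },
      { url: 'https://example.com/private', code: '403' },
      { url: 'https://example.com/fine', code: '200' },
      { url: 'https://example.com/also-gone', code: '404' },
    ]],
  },
};

// Hypothetical helper: group broken URLs by status code, keyed as
// 'error404', 'error403', and so on. Non-numeric codes and codes below
// 400 are skipped (an assumption about what counts as broken).
function groupBrokenLinks(settings) {
  const raw = (settings && settings.editoria11y && settings.editoria11y.linkchecker) || [];
  const links = raw.flat(); // Tolerates both [[...]] and [...] shapes.
  const groups = {};
  for (const { url, code } of links) {
    const status = Number.parseInt(code, 10);
    if (!Number.isInteger(status) || status < 400) continue;
    const key = `error${status}`;
    if (!groups[key]) groups[key] = [];
    groups[key].push(url);
  }
  return groups;
}

const brokenByCode = groupBrokenLinks(sampleSettings);
```

Each group's URL list could then be turned into a selector and passed to `Ed11y.findElements` with the group key as the first argument, following the pattern described earlier in the thread.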
- 🇺🇸United States itmaybejj
Yeah that might do it. I'm on project work for the next week but I'll give it a spin when I come up for air.