Use markdown-link-check to verify .md links

Created on 10 December 2024, 7 months ago

Problem/Motivation

We had an internal link in a doc page which had the incorect path.
Fixed in https://git.drupalcode.org/project/gitlab_templates/-/merge_requests/281...

External links could also be incorrect. Or may be valid at the time of writing, but subsequently the target url is moved or becomes unreachable.

Proposed resolution

Investigate if we want to use markdown-link-check to validate the links in our documentation pages.
https://github.com/tcort/markdown-link-check

Remaining tasks

Feature request
Status

Active

Component

gitlab-ci

Created by

🇬🇧United Kingdom jonathan1055

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @jonathan1055
  • 🇬🇧United Kingdom jonathan1055

    Markdown linting is being discussed on 🌱 Adopt CommonMark spec for Markdown files Needs work so anything we do here shoud be aligned to the outcome there.

  • 🇬🇧United Kingdom jonathan1055

    The "pages" job does actually give warnings if internal links are not found, see this Coding Standards MR job. I case not everyone can view that log, it says at the start:

    $ mkdocs build --site-dir public $_MKDOCS_EXTRA
    INFO    -  Cleaning site directory
    INFO    -  Building documentation to directory: /builds/project/coding_standards/public
    INFO    -  Downloading external file: https://unpkg.com/mermaid@11/dist/mermaid.min.js
    INFO    -  Doc file 'css/architecture.md' contains an absolute link '/files/10.progress.png', it was left as is.
    INFO    -  Doc file 'php/coding.md' contains an absolute link '/node/542202', it was left as is.
    INFO    -  Doc file 'php/coding.md' contains an absolute link '/node/542202', it was left as is.
    INFO    -  Doc file 'php/documentation.md' contains an unrecognized relative link 'todo-to-do-notes', it was left as is.
    INFO    -  Doc file 'php/naming-services.md' contains an absolute link '/node/2083979', it was left as is.
    INFO    -  Doc file 'php/psr4.md' contains an absolute link '/project/bootstrap/issues/2763861', it was left as is.
    

    Then at the end we also get

    INFO    -  Doc file 'css/architecture.md' contains a link '#formatting', but there is no such anchor on this page.
    INFO    -  Doc file 'css/review.md' contains a link 'format.md#rtl', but the doc 'css/format.md' does not contain an anchor '#rtl'.
    INFO    -  Doc file 'php/documentation-examples.md' contains a link 'documentation.md#callbacks-for-built-in-PHP-functions', but the doc 'php/documentation.md' does not contain an anchor '#callbacks-for-built-in-PHP-functions'.
    INFO    -  Doc file 'php/documentation-examples.md' contains a link 'documentation.md#hook-menu-page-router-callback-functions', but the doc 'php/documentation.md' does not contain an anchor '#hook-menu-page-router-callback-functions'.
    INFO    -  Doc file 'php/documentation.md' contains a link '#file', but there is no such anchor on this page.
    INFO    -  Doc file 'php/documentation.md' contains a link '#defgroup', but there is no such anchor on this page.
    INFO    -  Doc file 'php/documentation.md' contains a link '#Plugin', but there is no such anchor on this page.
    INFO    -  Doc file 'php/documentation.md' contains a link '#types', but there is no such anchor on this page.
    INFO    -  Doc file 'php/documentation.md' contains a link 'documentation.md#arrays', but there is no such anchor on this page.
    INFO    -  Doc file 'php/documentation.md' contains a link '#ref', but there is no such anchor on this page.

    These are not highlighted in the log and can be easily missed if you are not looking for them. Is there any way we can make them more prominent, or even make the job end with an amber warning instead of green pass?

  • 🇪🇸Spain fjgarlin
    $ mkdocs build --help
    Usage: mkdocs build [OPTIONS]
    
      Build the MkDocs documentation
    
    Options:
      -c, --clean / --dirty           Remove old files from the site_dir before building (the default).
      -f, --config-file FILENAME      Provide a specific MkDocs config. This can be a file name, or '-' to read from
                                      stdin.
      -s, --strict / --no-strict      Enable strict mode. This will cause MkDocs to abort the build on any warnings.
      -t, --theme [mkdocs|readthedocs]
                                      The theme to use when building your documentation.
      --use-directory-urls / --no-directory-urls
                                      Use directory URLs when building pages (the default).
      -d, --site-dir PATH             The directory to output the result of the documentation build.
      -q, --quiet                     Silence warnings
      -v, --verbose                   Enable verbose output
      -h, --help                      Show this message and exit.
    

    From that, maybe we could/should use the -s flag?

  • 🇬🇧United Kingdom jonathan1055

    Ah, that's good. Yes I will test the -s option here.

  • Merge request !374#3492951 Add --strict to mkdocs command → (Merged) created by jonathan1055
  • Pipeline finished with Success
    15 days ago
    Total: 155s
    #527165
  • Pipeline finished with Success
    15 days ago
    Total: 250s
    #527367
  • 🇬🇧United Kingdom jonathan1055

    I have tested this using GTD MR22 with
    BEFORE_SCRIPT_ACTIONS: pages-fail and it does fail, which is great.
    https://git.drupalcode.org/issue/gitlab_templates_downstream-3515276/-/j...

    INFO    -  The following pages exist in the docs directory, but are not included in the "nav" configuration: - page-not-reachable.md
    WARNING -  A reference to 'not-a-page.md' is included in the 'nav' configuration, which is not found in the documentation files.
    INFO    -  Doc file 'link-failure-page.md' contains an absolute link '/absolute', it was left as is.
    INFO    -  Doc file 'link-failure-page.md' contains an unrecognized relative link 'non-existent-link', it was left as is.
    INFO    -  Doc file 'link-failure-page.md' contains a link '#no-anchor-in-same-page', but there is no such anchor on this page.
    INFO    -  Doc file 'link-failure-page.md' contains a link 'index.md#no-anchor-in-other-page', but the doc 'index.md' does not contain an anchor '#no-anchor-in-other-page'.
    Aborted with 1 warnings in strict mode!
    

    NR for feedback, but there are still (atleast) two things to check:

    1. We get a mix of 'info' and 'warning' messages. From the final line it looks like only the 'warning' gives the error exit code. But we need to check this.
    2. Test with allow_failure: true to see if that allows the job to end amber not red
  • 🇪🇸Spain fjgarlin

    That's great actually! I think we're on the right track. Let's try those two things that you mention, but yeah, I'd set the strict option by default for all pages jobs.

  • 🇬🇧United Kingdom jonathan1055

    Yes, this is a good discovery. Thanks for posting that help output.

    I'd set the strict option by default for all pages jobs

    I was wondering if we need to have it controllable by a pipeline variable, it would be on strict by default, but we don't know if there are edge cases that cannot be fixed and we are forcing the job to fail. So having a variable to override and not add the --strict would cater for that scenario.

  • 🇪🇸Spain fjgarlin

    Yup, it can be controlled by a variable, that's totally fine (and probably desired).

  • 🇬🇧United Kingdom jonathan1055

    I've added variable _MKDOCS_STRICT which defaults to --strict but can be erased if required.

    The other way would be to use a 0/1 flag and derive the value. We've used both approaches previously, but mainly when there are multiple options. Now that I have typed up this comment, maybe this one should actually be a 0/1 switch, as I think that's what we have used more times than not.

  • Pipeline finished with Success
    15 days ago
    Total: 1073s
    #527912
  • 🇬🇧United Kingdom jonathan1055

    Tested with allow_failure: true in GTD MR24 and the job ends amber not red as we hoped.
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/56...

    If the single 'warning' is not present (temporarily commented out the line which creates that) then the job does indeed end green. So all those internal checks which are shown as 'info' in the log do not cause a exit code.
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/56...

    It's a shame that the internal bad links and anchors do not throw a detecable exit code. But it is still worth adding --strict. Maybe we could still investigate https://github.com/tcort/markdown-link-check as that would also verify external urls.

    NW for the change to variable as per #11

  • Pipeline finished with Success
    14 days ago
    Total: 252s
    #528072
  • Pipeline finished with Success
    14 days ago
    Total: 33s
    #528076
  • Pipeline finished with Success
    14 days ago
    Total: 250s
    #528077
  • Pipeline finished with Success
    14 days ago
    Total: 86s
    #528085
  • 🇬🇧United Kingdom jonathan1055

    New logic for 1/0 works
    With _MKDOCS_STRICT: 1 (the default)
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/56... - fails red

    With _MKDOCS_STRICT: 0
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/56... - passes green

    I have also added a section in the pages.md - which can be viewed here

    Ready for review.

  • Pipeline finished with Success
    14 days ago
    Total: 116s
    #528086
  • 🇪🇸Spain fjgarlin

    If looks good. I just thought about a possible edge case, which we can deal with in here. So NW based on that.

  • Pipeline finished with Success
    12 days ago
    Total: 149s
    #529998
  • Pipeline finished with Success
    12 days ago
    Total: 51s
    #530158
  • 🇬🇧United Kingdom jonathan1055

    I've removed the temporary lines, and responded in the MR why I think the other line should stay as-is.

    This is ready for review. Are you OK with help doc paragraph?

  • 🇪🇸Spain fjgarlin

    Happy with your reasoning. My only suggestion about the new paragraph is just simplifying the title a little bit. It reads well.

    Marking RTBC for now, and either feel free to apply the suggestion to the title or to resolve if you think the current version is best.

  • 🇬🇧United Kingdom jonathan1055

    I've applied your suggestion, it was an improvement, thanks.

    When this is committed, are we still going to leave the issue open to investigate markdown-link-check as originally suggested? Internal link errors are shown in the log so they can be fixed, it's just that they do not cause an failure exit code. I'm ok with marking this fixed if you are?

  • 🇪🇸Spain fjgarlin

    Given that this option inside mkdocs checks for valid links, I'd consider it enough. The proposed solution was made when we didn't know this option exists. So, once merged, I was planning on setting it as "Fixed".

  • Pipeline finished with Failed
    11 days ago
    Total: 1324s
    #530274
  • 🇬🇧United Kingdom jonathan1055

    Yes that's good. In that case I may expand that final line in the docs, to say tto check the log when making navigation changes, as different errors are writen at the start and end of the log output.

  • 🇪🇸Spain fjgarlin

    In that case I may expand that final line in the docs, to say tto check the log when making navigation changes

    Are you planning to do that in this issue or in the one about documentation changes? Happy eitherway, just asking because of the RTBC status.

  • 🇬🇧United Kingdom jonathan1055

    Sorry, got delayed with other things. Will do it here.

  • Pipeline finished with Success
    10 days ago
    Total: 59s
    #531366
  • Pipeline finished with Failed
    10 days ago
    Total: 82s
    #531370
  • Pipeline finished with Success
    10 days ago
    Total: 51s
    #531376
  • 🇬🇧United Kingdom jonathan1055

    Done, was going to RTBC but better for you to check the updated help doc paragraph

  • Pipeline finished with Success
    10 days ago
    Total: 51s
    #532007
  • 🇪🇸Spain fjgarlin

    The changes look good. RTBC.

  • Pipeline finished with Skipped
    10 days ago
    #532238
  • 🇪🇸Spain fjgarlin

    Merged. Thanks!

  • Merge request !378Fix internal links. → (Merged) created by fjgarlin
  • Pipeline finished with Success
    10 days ago
    Total: 81s
    #532240
  • Pipeline finished with Skipped
    10 days ago
    #532243
    • fjgarlin committed 52be2826 on main
      Issue #3492951 by jonathan1055, fjgarlin: Use mkdocs --strict option to...
  • 🇪🇸Spain fjgarlin

    I checked, and the failures were legit as the link would point to a broken page.
    Fixed with https://git.drupalcode.org/project/gitlab_templates/-/merge_requests/378, which is merged now.

  • 🇬🇧United Kingdom jonathan1055

    Hey, that's really satisfying that it found two broken links. But of course! that job is only run on commit to main. If I had thought more about this I would have made a temporary change to run 'pages' in the mr to find these two errors before commit.

  • 🇪🇸Spain fjgarlin

    That's fine. I'm glad the job found the issues and that we could fix them, even if it was post-merge.

Production build 0.71.5 2024