Created on 19 February 2024, 4 months ago
Updated 18 March 2024, 3 months ago

Problem/Motivation

Following from #3405955: Add job for CSPELL in contrib pipeline we can improve the output and make it easier for maintainers to use the job and start fixing the mis-spelled words.

[edit] During the work here a problem was found where the files in the top-level project folder are not being checked.

Proposed resolution

1. Write the number of mis-spellled words to the log
It would be really informative to show the number of different unrecognised words in the log, right after the cspell summary. This shows instant feedback on how the latest change performed, gives a measure of progress and serves as encouragement to get the number lower.

2. Use a better .cspell.json
Currently if the project does not contain a .cspell.json we create a very plain one, with no dictionaries defined. This means that Drupal words for theme names, and php function names are all reported as unrecognised. Core uses two dictionaries core/misc/cspell/drupal-dictionary.txt and core/misc/cspell/dictionary.txt and other pre-defined dictionaries. These could be added by default as there is no project that can run cspell and will need these two dictionaries.

3. Check the files in the top-level folder
See issue comment #7-12 for details.

Remaining tasks

All three tasks above are done.

Feature request
Status

Fixed

Component

gitlab-ci

Created by

🇬🇧United Kingdom jonathan1055

Live updates comments and jobs are added and updated live.
  • Contributed project blocker

    It denotes an issue that prevents porting of a contributed project to the stable version of Drupal due to missing APIs, regressions, and so on.

Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @jonathan1055
  • 🇧🇪Belgium Wim Leers Ghent 🇧🇪🇪🇺

    This blocks #3347168-8: [PP-1] Match core's code quality checks: cspell .

    Thanks for pushing this forward, @jonathan1055! 🤩

  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #98516
  • 🇬🇧United Kingdom jonathan1055

    First commit shows the number of words in the log.

    I also experimented by adding the --color parameter, see screengrab, from https://git.drupalcode.org/project/scheduler/-/jobs/850145 I think this is actually harder to read, but I don't know if the gitlab colors are customisable on a per-user basis?

  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #98600
  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #98631
  • Pipeline finished with Success
    4 months ago
    Total: 30s
    #98752
  • Status changed to Needs review 4 months ago
  • 🇬🇧United Kingdom jonathan1055

    I have added a first draft of the .cspell.json file, stored in the new /assets directory (as agreed with fjgarlin on Slack). This is only used if the project does not have their own. I have also added the provision for the project to have their own custom dictionary file called .cspell-project-words.txt This is defined in the .cspell.json but is optional because an empty file will be created if none exists. This should allow most projects to just use the .cspell.json that we provide and not need their own. They just need to create a .cspell-project-words.txt file if they have special words. There may be scope to add in the contents of $_CSPELL_WORDS but I think that is not so important now.

    A test on Scheduler without the new .cspell.json gave

    CSpell: Files checked: 176, Issues found: 727 in 116 files
    The number of unrecognised/misspelled words is 108
    

    But with the new default this is reduced to

    CSpell: Files checked: 176, Issues found: 143 in 50 files
    The number of unrecognised/misspelled words is 57
    

    Most of these 57 words are actual real spelling mistakes. There are other words that refer to common drupal terms, such as Taxonomy, nodetype, mediatype which are included in the Drupal dictionary. I would have thought they should be. Anyway, that's for later. This is ready for review. It can be tested using the following in a .gitlab-ci.yml

      # MR 131
      - project: issue/gitlab_templates-3422323
        ref: 3422323-improve-cspell
    
  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #98824
  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #99268
  • 🇬🇧United Kingdom jonathan1055

    Odd problem in testing. The sed command works for adding in the CSPELL_WORDS, but it fails on CSPELL_IGNORE_PATHS. Any ideas? I have dispayed the file at the end in after_script
    https://git.drupalcode.org/project/scheduler/-/jobs/857419

  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #99371
  • Pipeline finished with Success
    4 months ago
    Total: 4838s
    #99463
  • Status changed to Needs work 4 months ago
  • 🇬🇧United Kingdom jonathan1055

    I have just noticed that no files in the project root folder are being checked. All files in one sub-directory and down are included, but top-level files are not. See https://git.drupalcode.org/project/scheduler/-/jobs/861496 where I have removed --no-progress so that we can see every file being checked.

    In a question about globRoot fjgarlin pointed to 📌 Fix cspell use: specify globRoot and always pass --root to cspell Fixed where globRoot and --root=$TOP_LEVEL were added. I have tried adding < code>--root $CI_PROJECT_DIR/$_WEB_ROOT/modules/custom/$CI_PROJECT_NAME and also --root=.. and speficying the files to search with $CI_PROJECT_NAME/**/* but neither of these are correct. More testing required.

  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #100269
  • First commit to issue fork.
  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #101306
  • 🇬🇧United Kingdom jonathan1055

    I was investigating why files in the top level of the project are not checked, spent a long time, going round all the possible combinations, trying to see how globRoot affects the problem, how --root might be involved, and the various ways to specify what files to check (*, **, **/*, "**", ./** etc) The answers were just not making sense. I could get all files in the entire web/modules/contrib tree, simultaneously with nothing from web/modules/custom. I could get all files in the lower directories, just not those at the top level. The problem discussed and resolved in 📌 Fix cspell use: specify globRoot and always pass --root to cspell Fixed did seem very relevent as it was talking about the relative folders where the .cspell.json file lives, and the current directory from where the command is executed. I tried executing the cspell from one directory up, I moved the .cspell.json file one folder up, I used full absolute path for the config and for the files to test. Nothing was getting any nearer.

    It was driving me crazy, but then I had a lightbulb moment, and wondereid if the files were not being found due to the symlinks from modules/custom/$CI_PROJECT back to $CI_PROJECT_DIR.So I created new files at the top level, not in the committed codebase but on-the-fly in the script. This was it! These new files, real files not symlinks, were discovered and tested by cspell.

    So now we need to work out if we can make cspell read symlinked files. The alternative would be to copy the files not symlink them. I did have an earlier problem with file permissions due to the files being symlinked not existing as real files. That was in the ESLINT auto-fixes patch, but I did things a different way and the problem resolved itself. But we need to have a discussion about any downsides to copying compared to symlink, as I propose we change the symlink script to solve this cspell problem, and others.

  • Pipeline finished with Success
    4 months ago
    Total: 33s
    #101349
  • 🇪🇸Spain fjgarlin

    +1 to create an issue and investigate copy vs symlink.

    --

    This is weird. If it's due to symlinks, it wouldn't do the other symlinked files in deeper directories.

  • Status changed to Needs review 4 months ago
  • 🇪🇸Spain fjgarlin

    I think I found a workaround for this, for now. Copy the module folder into another location as actual files, and then run the job in that folder.

    Job before the changes: https://git.drupalcode.org/project/keycdn/-/jobs/879355 (we can see how it ignores the top level files)
    Job with additional output: https://git.drupalcode.org/project/keycdn/-/jobs/880303 (top level files are processed)

    Final MR: https://git.drupalcode.org/project/gitlab_templates/-/merge_requests/131...

  • 🇬🇧United Kingdom jonathan1055

    This is weird. If it's due to symlinks, it wouldn't do the other symlinked files in deeper directories.

    I noticed something odd with the symlinks before. I looked like it is just the top-level folders that are symlinked, not the actual individual files within those folders. So those files in the lower folders appear different compared to files in the top folder. You are not allowed to change directory into a sub-folder, for example (see screen grab).

    New version looks like a reasonable workaround temporarily, but we do need to follow up with a better solution. Will test and feedback.

  • Status changed to Needs work 4 months ago
  • 🇬🇧United Kingdom jonathan1055

    It worked, the base-folder files are checked, but there are few things to fix.

    1. The new folder $CI_PROJECT_DIR/gitlabci_cspell contains one folder $PROJECT_NAME, you can see in the log each file starts ./keycdn/ If the cp command ends with / then we avoid this extra folder and that is preferrable for the output
    2. I presume the sed following "Replace relative paths for absolute to run it in this job" is needed only for this workaround. Could add a comment saying that, so we can remove that line if the copy is reverted.
    3. The downloaded .cspell.json artifact does not have these sed changes in. Need to move the cp after the sed.
    4. There is a problem with using the .cspell-projects-words.txt. It did work, but is not recognised now. Maybe due to sequence of processing, and when the copy is done. Having the .cspell.json as artifact will help debug this.

    I can add these few tweaks and test again.

  • 🇪🇸Spain fjgarlin

    1. Good suggestion. Trying now.
    2. Correct. I added a "@todo" next to the two places that can be reverted once the symlinks issue is resolved.
    3. I did it there on purpose, so the artifact for the maintainers would be a valid one and make sense, otherwise it'll have the GitLab CI absolute paths, which will not make it work locally. I don't think we need to change this.
    4. That's weird as we are creating it in the folder and then the .cspell.json file should look for it in the same folder.

    Would you be ok investigating 4?

  • Status changed to Needs review 4 months ago
  • 🇨🇦Canada mparker17 UTC-4

    This is great, and I was super-excited to see it when I was trying to add GitLab CI support to one of the projects that I maintain yesterday evening (sitemap)!

    However I was pretty confused about how to ignore words... after much late-night confusion, and a good night's sleep, I found this issue, and I am happy to see that (a) there will be a .cspell-project-words.txt file to put project-specific words in, and (b) I can put comments in that file (which I cannot do in .cspell.json, i.e.: in the "words" list.

    That being said, do we have a plan to document the new cspell lint, and .cspell-project-words.txt? .cspell-project-words.txt doesn't seem to be a "well-known" filename (although I did happe to notice it being used in 📌 Use CSPELL to fix spelling errors and add custom dictionary Active after searching Drupal.org for "cspell").

    This is a bigger issue unrelated to this (look for a ticket soon!), but other than the GitLab CI on Drupal.org overview page , we don't really have a place to document configuration for each of the linters.

    That being said, I daresay spelling mistakes are common enough to warrant a short blurb about how to ignore them in either the "FAQ" section or the "Common tasks" section of the overview page... something like...

    #### How do I ignore words that CSpell thinks are spelling mistakes?

    The simplest solution is to add a .cspell-project-words.txt file to your project, which follows CSpell's Words List Syntax (i.e.: one word per line; lines starting with # are comments, words prefixed with ~ are matched case-insensitively, etc.).

    For more advanced configuration, you can add a .cspell.json file in your project, which follows CSpell's configuration syntax. Start by copying [Drupal.org's Gitlab Template](link to the assets/.cspell.json added in this merge request), and customizing it to your needs.

    To skip CSpell checking altogether, add a SKIP_CSPELL: "1" variable to your project's .gitlab-ci.yml.

  • Status changed to Needs work 4 months ago
  • 🇨🇦Canada mparker17 UTC-4

    Oops I accidentally changed the status! Sorry about that; putting it back.

  • 🇬🇧United Kingdom jonathan1055

    Thanks mparker17, yes we will be documenting these things and/or pointing to existing external pages, eg https://cspell.org/docs/
    The name of the file, I got that from a suggestion on the cspell site. The issue you linked to was my own initial attempt at doing this. The work is very new for Contrib, so please excuse us if all the docs and help are not there yet. The process is still being honed.

  • Status changed to Needs review 4 months ago
  • 🇪🇸Spain fjgarlin

    I think points 1, 2 and 4 are now fixed.
    3 needed no action in my opinion.

    And yesss, we defo can and should follow up with documentation updates.

    Back to needs review.

  • 🇪🇸Spain fjgarlin

    All feedback in comments here and the MR has been addressed and is ready for review.
    I added an _EXTRA variable to be inline with other jobs.

  • Status changed to RTBC 4 months ago
  • 🇬🇧United Kingdom jonathan1055

    Tested on https://www.drupal.org/project/cdn/issues/3347168#comment-15459111 📌 [PP-1] Match core's code quality checks: cspell Postponed
    all looks good. This is RTBC!

  • Pipeline finished with Skipped
    4 months ago
    #102167
  • Status changed to Fixed 4 months ago
  • 🇪🇸Spain fjgarlin

    This is merged now, thanks so much!! Great improvement.

    It'd be great if we can follow up with an update to the docs as mentioned in #15 and #17.

  • Issue was unassigned.
  • 🇬🇧United Kingdom jonathan1055

    Updated issue summary to include fixing the problem of top-level folder being ignored.

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.69.0 2024