Add blacklist and whitelist to flagWords

Created on 13 May 2025, 23 days ago

Problem/Motivation

Part of the Core initiative to removing usage of "blacklist" and "whitelist" is to add these words the to cspell 'flagwords'
so that once removed they are not added back - see 📌 Add blacklist and whitelist to the list of flagwords Active

Contrib can (and should I believe) also make these changes. In fact we do not need to wait for that core issue to be merged, we can add the four new words to the default assets/.cspell.json flagWords whenever we decide to.

Maybe there should be some announcement or change record, or just a post on Slack pointing to this issue, as this will make the cspell jobs fail if the blacklist/whitelist words are in use.

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

📌 Task
Status

Active

Component

gitlab-ci

Created by

🇬🇧United Kingdom jonathan1055

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @jonathan1055
  • 🇬🇧United Kingdom jonathan1055

    Since preparing this issue, the Core issue has now been merges in 11.x so we should follow, as the original flagWords used in contrib pipelines was taken from the core file.

  • 🇬🇧United Kingdom jonathan1055

    If we added a change record, it could link to the core issues, and also explain about using cspell:ignore whitelist etc in places which cannot be changed yet (there are several in that Core MR)

    Do we need to give notice of this upcoming chanage, say in Slack? The default allow_failure for cpsell is true, so unless any contrib maintainer has explicitly set it to false, then the pipeline will not fail.

  • Pipeline finished with Success
    23 days ago
    Total: 922s
    #495690
  • 🇬🇧United Kingdom jonathan1055

    I have added BEFORE_SCRIPT_ACTIONS to test the downstream projects, but they failed already even without that commit.
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/52...

    This highlights another problem I had already noted about cspell, that it is checking the contents of build.env when I do not think it should. The maintainer does not have control over what goes in to this file, and things like branch names and references are written which fail spelling. This confuses the log, and obscures the real spelling errors.

    In this case, the job failed because that branch did not have its own .cspell.json so was taking the file from Core. The changes in this MR are needed when a project has its own .cspel.json which probably may not contain all the core flagWords.

  • Pipeline finished with Success
    23 days ago
    Total: 50s
    #495732
  • Pipeline finished with Success
    23 days ago
    Total: 1207s
    #495708
  • 🇪🇸Spain fjgarlin

    From the core issue (comment #7):

    It looks like chromedriver have deprecated --whitelisted-ips and replaced it with --allowed-ips, let's see if this works.

    We should do this here too I guess.

    --

    Re change record: yes, let's do it and link to the core issue. As you say, the default is to allow failure, so the overall pipeline result shouldn't be affected. I can make a quick slack post once we merge it, pointing at the change record.

    --

    Re build.env, yes, it should be ignored in the cspell job.

  • 🇪🇸Spain fjgarlin

    NW for the chromedriver args. The rest looks good.

  • 🇬🇧United Kingdom jonathan1055

    Before we change --whitelisted-ips in main.yml and main-d7.yml I would like to know why those did not fail the internal cspell job? We use the same scripts/prepare-cspell.php for our own checking, don't we?

  • 🇬🇧United Kingdom jonathan1055

    Unintended change, sorry. New comments arrive, but the form values remain unchanged, so when I reply it makes the change back to the old value.

  • 🇬🇧United Kingdom jonathan1055

    Of course. I was thinking that we merged the values from the contrib default if they were not in the internal file. But we don't do that. We could I suppose. But the simple solution is to just put them in both files. I'll do that for all the flagWords.

  • Pipeline finished with Success
    23 days ago
    Total: 48s
    #495867
  • Pipeline finished with Failed
    23 days ago
    Total: 195s
    #495879
  • 🇬🇧United Kingdom jonathan1055

    Adding the blacklist and whitelist words to /internal/.cspell.json made the 'check code' job fail as intended.
    https://git.drupalcode.org/issue/gitlab_templates-3524087/-/jobs/5240491
    I also re-ordered a couple of items in the file to make them the same as the default, for easier comparing.

    Now that we have all the other flagWords, I had to add <!--- cspell:ignore grey queuing --> into the cspell.md file which is fine. But interestingly the word 'e-mail' was not reported as a flagged word. Probably the hyphen is treated as a word-break? Maybe this does not work in Core either?

  • Pipeline finished with Success
    23 days ago
    Total: 52s
    #495915
  • Pipeline finished with Success
    23 days ago
    Total: 52s
    #495919
  • Pipeline finished with Success
    23 days ago
    Total: 201s
    #495932
  • 🇬🇧United Kingdom jonathan1055

    I have added GTD before_script_action to all the D9 and D10+ branches to used flagged words. The jobs all fail as required. Here is one example
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/52...

    Interesting that we get suggestions of the plural forms of the words:

    spelling-test.md:1:23      - Forbidden word (blacklisted) -- Now we are using some blacklisted whitelist words.
      Suggestions: [blacklists, blacklist's, backlist, backlists, blocklist]
    
    spelling-test.md:1:35      - Forbidden word (whitelist)   -- using some blacklisted whitelist words.
      Suggestions: [whitelists, whitest, whitefish, whiteouts, whitebaits]
    

    So I think we should add those forms too, then feed back to the c ore issue to add them also. There is no point in have 'blacklist' as a flagged word if we allow 'blacklists'

  • Pipeline finished with Success
    23 days ago
    Total: 53s
    #495940
  • Pipeline finished with Success
    23 days ago
    Total: 242s
    #495957
  • 🇪🇸Spain fjgarlin

    Maybe we can add the suggestions (eg: whitelist->allowlist). https://cspell.org/docs/api/cspell-types/interfaces/Settings#flagwords
    That way it will not recommend derivatives of the words.

  • Pipeline finished with Success
    23 days ago
    Total: 546s
    #496114
  • Pipeline finished with Success
    23 days ago
    Total: 985s
    #496134
  • Pipeline finished with Success
    23 days ago
    Total: 4775s
    #496200
  • 🇬🇧United Kingdom jonathan1055

    That's even better. I have tested this locally, and found out that suggestions are only given when the specific word is supplied with a suggestion, you don't get derivative-created suggestions. I've pushed the changes and here's the test.

    However, I have found that even though 'blocklist' is a known word, 'blocklists' and 'blocklisted' are not recognised.

    spelling-test.md:2:40      - Unknown word (blocklisted) -- of whitelists and the blocklisted denylist allowlists
    	 Suggestions: [blocklist, blockList, BlockList, blocklint, blacklist's]
    

    So maybe we should not be adding those as suggestions, just 'blocklist' and all the 'denylist' variations.

  • Pipeline finished with Success
    23 days ago
    Total: 691s
    #496399
  • 🇬🇧United Kingdom jonathan1055

    Now we do not get suggestions for 'blocklisted'
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/52...

    This is ready for review and feedback. Is it possible to verify the chromdriver change?

    -    - '--whitelisted-ips='
    +    - '--allowed-ips='
    

    Or do we just assume that because that is what was changed in core then it's correct here too?

  • 🇪🇸Spain fjgarlin

    Testing any pipeline that uses the legacy driver should be good enough, or any D7 pipeline.
    I think the change should be pretty safe because it actually happened a few years ago https://github.com/SeleniumHQ/selenium/pull/11409 upstream.

    I think that, other than the @todo, everything looks good. RTBC (pending the removal of the todo).

  • Pipeline finished with Failed
    22 days ago
    Total: 163s
    #496880
  • 🇬🇧United Kingdom jonathan1055

    Our GTD d9 branch uses the legacy driver, and those javascript tests passed, so that seems OK.
    I've removed the temporary test lines.
    NW because we were we going to create a Change Record

  • 🇬🇧United Kingdom jonathan1055

    I have created a draft CR but it may need more detail.
    https://www.drupal.org/node/3524446 →

  • 🇪🇸Spain fjgarlin

    Thanks for drafting it. I added some more text to indicate workarounds or actions to do if people want to fix the possible issues. I also mentioned that the overall status of the pipeline should remain unchanged.

  • 🇪🇸Spain fjgarlin

    If you are happy with the changes made to the CR, then we can merge this (probably tomorrow).

  • 🇬🇧United Kingdom jonathan1055

    Thanks. Yes those additions to the CR are good and I made a few edits.

    It's a shame that if a project has their own .cspell.json then they do not get any of the improvement added to Core. An alternative way to add this would have been via the $cspell_flagwords pre-processing in the prepare-cspell.php script. Then all Contrib would have got the changes even if they have their own file. Is that worth considering? Maybe I should have thought of that first? It would not be very hard to do.

  • 🇪🇸Spain fjgarlin

    Not sure we need to do that far. Everyone with the defaults will get it.

    Someone with a cspell configuration file might actually want full control of the job and might not want modifications on top of it.

    Something we could do is to detect this situation and if the flagwords that we recommend are not present then show a big warning message in the log. They might not see it at first but when they get a cspell error in the future they will see it.

  • 🇬🇧United Kingdom jonathan1055

    detect this situation and if the flagwords that we recommend are not present then show a big warning message

    That is a good idea. We don't alter their .cspell.json but if any of the flagwords in the default are missing from their custom file we recommend they add them. Better to do that in this MR, then update the Change Record to reflect it.

  • Pipeline finished with Success
    19 days ago
    Total: 83s
    #499679
  • Pipeline finished with Success
    19 days ago
    Total: 55s
    #499684
  • Pipeline finished with Success
    19 days ago
    Total: 82s
    #499690
  • 🇬🇧United Kingdom jonathan1055

    I have added a notice when the projects file does not contain all of the default flagwords.
    Here's the first test and rather ironically the test failed, not because of the new processin, but because the MR branch name contains the word 'flagwords' which is not recognised in the dictionaries. This only happens because the MR branch name is written into the .gitlab-ci.yml, it will not be a problem for real.

    I added that to the scheduler dictionary and here is the second test

  • 🇬🇧United Kingdom jonathan1055

    I have also committed a project .cspell.json to the d10-theme downstream branch, with only a couple of the default flagwords. This improves test coverage, and shows the advice message
    https://git.drupalcode.org/project/gitlab_templates_downstream/-/jobs/52...

  • 🇪🇸Spain fjgarlin

    Thanks for the additions. I left some small feedback in the MR to cater for different syntax of the flagWords option.

  • 🇬🇧United Kingdom jonathan1055

    Good spot on the : syntax. I have made that change, I just used : as the replacement for -> instead of + then tokenized on :
    I also used the : syntax in one set of the suggested words in the assets files, to prove the change covers both syntaxes.
    Ready for review.

  • Pipeline finished with Success
    17 days ago
    Total: 553s
    #500755
  • 🇪🇸Spain fjgarlin

    Great. I think this ready.

    I suggest that the last comment in the MR is made into a follow-up issue.
    Once I merge this, I will also publish the change record.

  • 🇬🇧United Kingdom jonathan1055

    I have added 📌 Check when core .cspell.json changes and keep the assets files aligned Active following the MR discussion.

  • Pipeline finished with Skipped
    17 days ago
    #500816
  • 🇪🇸Spain fjgarlin

    Merged and change record published.

  • Automatically closed - issue fixed for 2 weeks with no activity.

  • 🇬🇧United Kingdom jonathan1055

    This commit has now just been included in default-ref
    For info I did a search on how many of these words are currently in the contrib codebase
    Search for blacklist returns 2,502
    Search for whitelist returns 3,751

    Many of these will have to be ignored, not fixed. So we could see an increase in the simple counts. Maybe we can work out some regex which would exclude the "cspell:ignore" lines, so we can then see how these counts change.

Production build 0.71.5 2024