Exclude node_modules folder from artifacts

Created on 3 July 2025, 2 days ago

Problem/Motivation

We often get pinged in the #gitlab channel or the issue queue about the following error: ERROR: Uploading artifacts as "archive" to coordinator... 413 Request Entity Too Large.

We have already dealt with some of these cases by ignoring ".git" files, which helps when bringing in "dev" packages.

However, looking at the size of one module's artifact for the composer job, we can see (the first numeric value is the size in MB):

$ du -m ~/Downloads/artifacts-538085-composer | sort -nr | head -n 10

645	~/Downloads/artifacts-538085-composer
546	~/Downloads/artifacts-538085-composer/web
489	~/Downloads/artifacts-538085-composer/web/core
364	~/Downloads/artifacts-538085-composer/web/core/node_modules
118	~/Downloads/artifacts-538085-composer/web/core/node_modules/@ckeditor
98	~/Downloads/artifacts-538085-composer/vendor
66	~/Downloads/artifacts-538085-composer/web/core/modules
57	~/Downloads/artifacts-538085-composer/web/modules/contrib
57	~/Downloads/artifacts-538085-composer/web/modules

We can see that "node_modules" is responsible for more than 50% of the size of the artifact. It appears that the size of this folder for D11 is even bigger than for D10, so pipelines might be failing purely because of this.
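The "more than 50%" figure can be checked directly against the du output above. This is a minimal sketch; the helper name `share_percent` is hypothetical and the two sizes are the MB values from that listing.

```shell
# Hypothetical helper: integer percentage (rounded down) of a part
# relative to a total.
share_percent() {
  part="$1"
  total="$2"
  echo $(( part * 100 / total ))
}

# Sizes in MB from the du listing: web/core/node_modules vs. the whole artifact.
share_percent 364 645   # prints 56
```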

This issue is related to 📌 Consider merging the build and validate stages (Active), but that one is probably broader in scope. This one is only about the "node_modules" folder, and it would give us a reduction of more than 50% in artifact size.

Steps to reproduce

Example: https://git.drupalcode.org/project/contribution_records/-/jobs/5765518

Normal module with very few dependencies.

Proposed resolution

Exclude the "node_modules" folder from the "composer" artifact and run yarn install in the subsequent jobs that need it:
- stylelint
- eslint
- cspell
- .nightwatch-base
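As a rough illustration of what "run yarn install where needed" could look like inside one of those jobs, here is a minimal sketch. It is not the implementation from the merge request. The helper name `ensure_node_modules` and the `INSTALL_CMD` override are hypothetical, and the sketch assumes the package.json lives in the directory passed as the first argument (web/core in the artifact listing above).

```shell
# Hypothetical helper: install JS dependencies only when node_modules is
# missing from the given directory (for example because it was excluded
# from the composer artifact).
ensure_node_modules() {
  core_dir="$1"
  if [ -d "$core_dir/node_modules" ]; then
    echo "node_modules already present, skipping install"
    return 0
  fi
  # INSTALL_CMD is an illustrative override so the sketch can be tried
  # without yarn; by default it runs "yarn install" in the directory.
  ( cd "$core_dir" && ${INSTALL_CMD:-yarn install} )
}
```

A job script would then call something like `ensure_node_modules web/core` before running its linter, so jobs that still receive the folder from the artifact pay no extra cost.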

If we don't want to risk breaking backwards compatibility (not sure how this would), we could put this behind a variable.

Remaining tasks

MR

📌 Task
Status

Active

Component

gitlab-ci

Created by

🇪🇸Spain fjgarlin


Merge Requests

Comments & Activities

  • Issue created by @fjgarlin
  • Merge request !381 "Variable to control when yarn install is run" (Open) created by fjgarlin
  • 🇪🇸Spain fjgarlin

    MR in place and ready for review. I triggered KeyCDN and Decoupled pages but can't do the GTD (I was removed from the maintainers in order to test #3426311: Allow testing documentation pages via MRs ).

  • 🇺🇸United States drumm NY, US

    I’m a bit wary of running yarn install 4x. That will be extra compute, which is generally more expensive than storage. It also means additional network requests, which have the potential to get noticed by upstream providers. We don’t really have a good setup to know the real cost tradeoff.

    If we were able to do things like skip eslint & stylelint when JS & CSS aren’t changed in a MR, this would be a great approach. That should be a net reduction in compute.

    Making a separate node_modules job/artifact might be an option, although it’d be more wall time to get through each pipeline, don’t know if that’s a good idea.

    For now I’ve raised the artifact size limit to 250M to give us a bit more breathing room.

  • 🇪🇸Spain fjgarlin

    Most jobs aren’t run if there are no files of the relevant type present (e.g. no phpunit job will trigger if there are no PHP files), so that’s already taken care of.

    With the new limit, let’s see how it holds.

    This solution can be a plan B if needed or even useful if people have similar issues, so they can customize their integrations.

    So I’ll close it for now.

  • 🇬🇧United Kingdom catch

    Re-opening (possibly briefly; I don't mind if it's closed again, but I was looking for an issue like this and found this one) to link to 🐛 drupal_phpunit_find_extension_directories() uses infinite recursion ⇒ more directories = slower tests (Needs work) and https://git.drupalcode.org/project/experience_builder/-/merge_requests/4...

    Short version is having node_modules when discovering phpunit tests can add potentially minutes to contrib pipelines.
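To get a feel for how much extra directory traversal node_modules adds, one could compare directory counts with and without it. This is only a sketch using `find`; it does not reproduce how the PHPUnit discovery actually scans, and the helper name `count_dirs` and its `skip-node-modules` flag are hypothetical.

```shell
# Hypothetical helper: count directories under $1, optionally skipping
# every node_modules directory and everything below it.
count_dirs() {
  root="$1"
  skip="${2:-}"
  if [ "$skip" = "skip-node-modules" ]; then
    # -prune stops find from descending into node_modules directories.
    find "$root" -name node_modules -prune -o -type d -print | wc -l
  else
    find "$root" -type d | wc -l
  fi
}
```

Running `count_dirs web` and `count_dirs web skip-node-modules` on an unpacked composer artifact would show how many directories a recursive scan no longer has to visit.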

    We might be able to fix the core issue, but it's been stalled for years now. But I'm also wondering if there's a way to make the download of artifacts into the phpunit jobs more targeted so that node_modules is excluded? I've done that with individual files in the core pipeline for lint job caching but not directories yet.

    For core we removed our composer build and lint stages because running composer install or yarn install takes a lot less time than getting a GitLab runner to run them and upload the results as artifacts for the next stages to use: 📌 Merge the build and lint stages in core MR pipelines (Downport). There's also 📌 Consider merging the build and validate stages (Active) for gitlab_templates.

  • 🇬🇧United Kingdom jonathan1055

    In reply to #3
    @fjgarlin I will add you back as maintainer of GTD. But strangely I just tried and cannot trigger the downstream pipelines either. Just get the red "An error occurred while making the request."

    In reply to #5 and #6

    "If we were able to do things like skip eslint & stylelint when JS & CSS aren’t changed in a MR, this would be a great approach. That should be a net reduction in compute."

    We have 📌 Only run linting jobs if the files changed make sense for the job (Active), which sounds like what you want.

  • 🇪🇸Spain fjgarlin

    Perhaps we can go forward with this issue as opt-in, that way modules like XB could just turn a variable on and use this approach.

  • 🇪🇸Spain fjgarlin

    I did the above.

    The new variable's default value makes everything behave the same as up until now, so no change will be pushed unless opted in, and we've documented it as well.

    We don't need the change record in this case, so I'll remove that, but the MR is ready for review again.

  • 🇪🇸Spain fjgarlin

    @jonathan1055 - I was able to retrigger GTD pipelines again, thanks!
    https://git.drupalcode.org/project/gitlab_templates/-/pipelines/538928

  • 🇬🇧United Kingdom jonathan1055

    I made one more edit/suggestion in the doc page.

    Do you think the GTD tests are enough? Shall I try this with Scheduler?

  • 🇪🇸Spain fjgarlin

    Yes please, try with scheduler. That's got way more tests.

    I am assuming that "node_modules" is never needed for PHPUnit tests. In any case, this is opt-in and it is documented how to add it to any job, so it shouldn't be an issue. But yes, let's test Scheduler with this feature turned on and off and see the results.

  • 🇬🇧United Kingdom jonathan1055

    Test using MR381 with no changes, so installs node_modules as usual. Here is the full pipeline - all passed.
    The composer log shows

    Top 10 folders by size:
    489	web/core
    364	web/core/node_modules
    118	web/core/node_modules/@ckeditor
    66	web/core/modules
    39	web/core/node_modules/ckeditor5
    37	web/core/node_modules/ckeditor5/dist
    30	web/core/node_modules/ckeditor5/dist/browser
    18	web/core/node_modules/selenium-webdriver
    17	web/modules/contrib
    17	web/modules
    

    The downloaded artifact zip is 142 MB; unzipped it is 643 MB.

    Test using MR381 with _COMPOSER_YARN_INSTALL: 0
    Full pipeline all passed.
    The composer log does not show web/core/node_modules. We see

    Top 10 folders by size:
    125	web/core
    66	web/core/modules
    17	web/modules/contrib
    17	web/modules
    16	web/core/assets
    15	web/core/lib/Drupal
    15	web/core/lib
    15	web/core/assets/vendor
    13	web/core/tests
    13	web/core/assets/vendor/ckeditor5

    The downloaded artifact zip is 57 MB; unzipped it is 261 MB, a significant decrease.
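The size of that decrease can be worked out from the two test runs above. This is a minimal sketch; the helper name `reduction_percent` is hypothetical and the inputs are the MB figures reported in this comment.

```shell
# Hypothetical helper: percentage reduction from a "before" size to an
# "after" size, rounded to the nearest whole number.
reduction_percent() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) * 100 / before }'
}

reduction_percent 142 57    # zipped artifact, MB: prints 60
reduction_percent 643 261   # unzipped artifact, MB: prints 59
```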

    The eslint, stylelint and cspell jobs have this, which I don't understand, but is probably nothing to worry about:

    ➤ YN0007: │ @nightwatch/nightwatch-inspector@npm:1.0.1 must be built because it never has been before or the last one failed
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT 
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT > @nightwatch/nightwatch-inspector@1.0.1 build
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT > node preprocessExtension.js
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT 
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT 📂 Moving to src folder ...
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT 🚀 Creating .crx file ...
    ➤ YN0000: │ @nightwatch/nightwatch-inspector@npm:1.0.1 STDOUT ✅ .crx file created successfully!
    

    Note that this only tests ESLint, Stylelint and CSpell, because Scheduler does not have any Nightwatch tests. I've made a couple of comments in the MR but it all looks good.
