- Issue created by @fjgarlin
- 🇪🇸Spain fjgarlin
The suggested code
exclude:
  - '.git'
  - '.git/**/*'
  - '$_WEB_ROOT/**/.git'
  - '$_WEB_ROOT/**/.git/**/*'
  - 'vendor/**/.git'
  - 'vendor/**/.git/**/*'
is set in the MR in the linked issue and it's working.
- 🇬🇧United Kingdom jonathan1055
So the actual addition is
  - '$_WEB_ROOT/**/.git'
  - '$_WEB_ROOT/**/.git/**/*'
but you are also changing vendor/**/.git/* to vendor/**/.git/**/* in d10 to match what we already had in main-d7. So the two sets are now identical. All looks good.
In https://git.drupalcode.org/project/contribution_records/-/merge_requests/3 can we change that to use this MR and drop the composer customization? That would prove we have this right.
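To illustrate the pattern change discussed above, here is a minimal sketch of an exclude list. It assumes the list lives under the .composer-base hidden job (the name mentioned later in this thread) and that GitLab does not apply exclude patterns recursively, so the contents of a directory have to be matched explicitly. The comments describe the expected matching behaviour and are not taken from the MR.

```yaml
# Sketch only: the job name and the comments are assumptions for illustration.
.composer-base:
  artifacts:
    exclude:
      # Matches the .git entry itself at any depth under vendor/.
      - 'vendor/**/.git'
      # Old d10 pattern: only the direct children of each .git folder.
      # - 'vendor/**/.git/*'
      # New pattern: everything below each .git folder, at any depth.
      - 'vendor/**/.git/**/*'
```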
- 🇬🇧United Kingdom jonathan1055
I changed MR3 to use and test this MR336 and in fact more files were ignored. In the previous run, with custom composer artifacts, the log shows
.: found 102676 matching artifact files and directories
web/**/.git: excluded 1 files
web/**/.git/**/*: excluded 49 files
.git: excluded 1 files
.git/**/*: excluded 19 files
In the pipeline using MR336 we get
.: found 102676 matching artifact files and directories
.git: excluded 1 files
.git/**/*: excluded 26 files <<< this has increased, was 19 above
web/**/.git: excluded 1 files
web/**/.git/**/*: excluded 49 files
But I don't know if you want to investigate what those extras are?
Also 102,676 files! Is there any scope to ignore more? I already have in my notes of things to raise, the fact that the composer artifact is huge when downloaded (for example 720 MB). But am I right in thinking it's the same set of files that are needed for the subsequent jobs? Just wondering if we can specify a smaller subset when downloading, or is it one and the same thing?
- 🇬🇧United Kingdom jonathan1055
The other thing I noted is that the composer artifacts definition does not have any 'name:' key so the downloaded file is just called artifacts.zip. Same with 'upgrade status'. But all the other jobs have
name: artifacts-$CI_PIPELINE_ID-$CI_JOB_NAME_SLUG
Can we change that to match? If so, shall we do it in this MR?
- 🇪🇸Spain fjgarlin
Happy to include #6 in this MR or in a separate issue.
Re #5, not sure what the extra files are, but in any case, we should just ignore anything ".git" related as that has the potential to be really big.
+100K files and +700 MB is heavy, but as long as it doesn't contain git files it should be fine. Bear in mind that the "vendor" folder is part of the artifact and so is all of Drupal core.
The alternative to passing these big artifacts between jobs is to run the composer install commands inside each job, but that'll be a big change. There is an issue for this, "Consider merging the build and validate stages" (Active), which I tagged 2.x because I don't know if we could ever achieve BC on it. But in any case, so far there is no problem with these big artifacts as they are deleted after a few weeks automatically.
--
So, if doing #6 is quick and easy let's do it here, otherwise as a follow up and maybe we can RTBC this one. Happy either way.
- 🇬🇧United Kingdom jonathan1055
The alternative to passing these big artifacts between jobs is to run the composer install commands inside each job, but that'll be a big change.
So I think that answers my question "... can specify a smaller subset when downloading, or is it one and the same thing?"
The composer artifact is deleted in 1 week, which is good. All the others are 6 months.
I have pushed the change to specify a name for the composer and upgrade-status artifacts. Here is the re-run of Contribution Records MR3 - artifact name is ok.
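As a rough sketch of the naming change described above: the name value is the one quoted earlier in this thread, and the 1 week expiry is the figure mentioned here. The .composer-base job name and the use of the expire_in key are assumptions about how the template is laid out, not copied from the MR.

```yaml
# Sketch only: layout of the hidden job is assumed, not taken from the MR.
.composer-base:
  artifacts:
    # Without a name, the download is just called artifacts.zip.
    name: artifacts-$CI_PIPELINE_ID-$CI_JOB_NAME_SLUG
    # The composer artifact is reported above as being deleted after 1 week.
    expire_in: 1 week
```

Because the job name slug is part of the file name, artifacts from two composer jobs in the same pipeline can be downloaded without clashing.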
I also tested the GTD MR7 (which runs d9-basic on Drupal 9 and 10 with Upgrade Status to check D11) manually via UI specifying this MR to test against. The two composer jobs correctly have the
$CI_PIPELINE_ID-$CI_JOB_NAME_SLUG
added, making it possible to download them both without a name clash. The Upgrade Status job likewise has the correct artifact name. Here is that pipeline.
I did notice in the composer artifacts, there are several vendor projects which have .github files. They are not large, of course, but could they also be ignored? Would simply changing .git to .git* in all the filter rows achieve that? It might be worth just seeing how much that reduces the file number and overall size? Other than that question, this would be RTBC.
- 🇬🇧United Kingdom jonathan1055
I pushed the change to ignore .git* and it made a bit of a difference to the number of files in the log
web/**/.git*/**/*: excluded 478 files
.git*: excluded 3 files
.git*/**/*: excluded 26 files
vendor/**/.git*: excluded 53 files
vendor/**/.git*/**/*: excluded 85 files
web/**/.git*: excluded 435 files
The download size unzipped was only reduced by a few MB and there are still .github folders downloaded, so I don't exactly understand how that exclude: is meant to work. I thought I did, but clearly not.
This is RTBC if you want to get on and do it. The last commit can stay in, as it does not appear to do any harm?
- 🇪🇸Spain fjgarlin
Let's revert the last commit, as that could be too greedy and ignore files needed for pipelines, like this folder: https://git.drupalcode.org/project/drupal/-/tree/11.x/.gitlab-ci?ref_typ...
I think just targeting ".git" and ".git/**" as it was before that commit should be enough for now. If we need to get deeper into excluding more files this could be a follow-up, but this one should be ready as a "quick" improvement.
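The concern about the wildcard being too greedy can be sketched like this. The .composer-base job name is assumed from later in the thread, and the file names in the comments (.github, .gitlab-ci, .gitignore) are examples of what a .git* glob would be expected to match, not an exhaustive list.

```yaml
# Sketch only: comments describe expected glob behaviour, not pipeline output.
.composer-base:
  artifacts:
    exclude:
      # Kept: targets only the .git folder and everything inside it.
      - '.git'
      - '.git/**/*'
      # Reverted: '.git*' would also match names such as .github,
      # .gitlab-ci and .gitignore, which later jobs may still need.
      # - '.git*'
      # - '.git*/**/*'
```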
- 🇬🇧United Kingdom jonathan1055
OK fine with me, I've reverted that.
RTBC
- fjgarlin → committed cb97ffe2 on main
Issue #3510977 by jonathan1055, fjgarlin: Ignore more git files
- 🇪🇸Spain fjgarlin
Merged. Thanks for the reviews, the extra addition and the tests.
- 🇺🇸United States cmlara
Just a note: this should likely have been done as a v2-only change, as it can still break pipelines.
As noted in "Ignore all git files in artifacts" (Active), I have used jobs in the past that depend upon git being present in the modules folder (my workflow involves copying all of the module's code to the custom folder and working out of the custom folder for all later steps, as it better aligns with core design and avoids symlink faults)
and the CI artifacts can grow really big so it fails.
The better solution for this would likely be to not depend upon the artifacts being built until needed. We need to deal with the lack of caching on d.o; however, over in Quasar I've been designing (in local testing) so that my phpunit and phpstan stages can be built "on demand" without storing the large asset files. Something similar can be done as a v2 for gitlab_templates and is somewhat a design expectation if the templates ever move to components.
- 🇪🇸Spain fjgarlin
This was more of a bug fix than a feature addition. We never intended to pack ".git" files in the artifacts.
The workaround, if you do need to have the ".git" files in the artifacts, would be to override the .composer-base:artifacts:exclude section.
Building "on demand" is what's suggested in "Consider merging the build and validate stages" (Active) and it's definitely a 2.x must-have, to avoid scenarios like this.
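The override workaround mentioned above might look roughly like the following in a project's own .gitlab-ci.yml. This is a sketch under assumptions: it assumes the project already includes the gitlab_templates files, and that GitLab merges a locally redefined job with the included one by replacing list values such as exclude rather than appending to them. The choice of which patterns to keep is illustrative.

```yaml
# Sketch only: placed in the project's .gitlab-ci.yml, after the template include.
.composer-base:
  artifacts:
    # Replaces the template's list. The $_WEB_ROOT patterns are left out
    # here so that .git folders under the web root stay in the artifact.
    exclude:
      - '.git'
      - '.git/**/*'
      - 'vendor/**/.git'
      - 'vendor/**/.git/**/*'
```

Checking the "excluded N files" lines in the composer job log, as done earlier in this thread, is one way to confirm which patterns are actually in effect.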
Automatically closed - issue fixed for 2 weeks with no activity.