Investigate using git-restore-mtime package for caching

Created on 29 July 2024, 10 months ago
Updated 18 September 2024, 8 months ago

Problem/Motivation

Caching for cspell currently uses the content caching strategy. This works, but it means hashing every file. We could investigate the git-restore-mtime package, which sets each file's mtime from the git history and could let us use the metadata caching strategy instead. This *might* improve runtimes even further.
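
To make the two strategies concrete: a content strategy hashes every file on every run, while a metadata strategy only compares timestamps. Below is a minimal sketch of the metadata idea; the file names, dates, and the "stamp" cache entry are made up for illustration.

```shell
#!/bin/sh
# Sketch: a metadata cache strategy skips a file when its mtime is older
# than the cache entry, avoiding content hashing entirely. git-restore-mtime
# makes this viable in CI by rewriting mtimes from git history instead of
# leaving them at checkout time (which would make every file look new).
set -e
dir=$(mktemp -d)
echo 'some content' > "$dir/a.txt"
touch -d '2020-01-01' "$dir/a.txt"   # what git-restore-mtime would set from history
touch -d '2021-01-01' "$dir/stamp"   # cache entry written by the previous lint run
if [ "$dir/a.txt" -nt "$dir/stamp" ]; then
  result='re-lint'                   # file changed after the cache was written
else
  result='cache hit'                 # older mtime: skip the file entirely
fi
echo "a.txt: $result"
```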

https://stackoverflow.com/questions/2458042/restore-a-files-modification...

Quick test run of the impact of running git restore-mtime:

$ git restore-mtime
18,163 files to be processed in work dir
Statistics:
         0.22 seconds
       18,421 log lines processed
           50 commits evaluated
        4,057 directories updated
       18,163 files updated

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

Introduced terminology

API changes

Data model changes

Release notes snippet

✨ Feature request
Status

Active

Version

11.0 🔥

Component
Other


Created by

🇳🇱Netherlands bbrala Netherlands


Merge Requests

Comments & Activities

  • Issue created by @bbrala
  • 🇬🇧United Kingdom catch

    I think to do this we might want to install it on the docker images themselves to save the extra apt install step in each job.

    eslint and stylelint have the same two cache strategy options so this would benefit all three if it's an improvement.

  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Fascinating! 🤓

    I have actually always wondered why this wasn't git's default behavior. I've found it confusing in the past. Thanks for teaching me, @bbrala 😄

  • 🇳🇱Netherlands bbrala Netherlands
  • 🇳🇱Netherlands bbrala Netherlands

    Since we have now fixed cache retrieval for all jobs, we should try this. The cspell job with cache still takes 1.5 minutes.

  • 🇳🇱Netherlands bbrala Netherlands

    Todo to validate; seems pretty straightforward:

    1. Add the package to the warm-cache and lint jobs.
    2. Adjust the warm-cache job to always run.
    3. Have the lint jobs depend on that job and fetch the cache artifacts; double-check they use metadata instead of file contents (that is either a command-line argument or a config change).
    4. Validate the runtime against HEAD (minus the (pseudocode) apt update && apt install -y git-mtime && git mtime).
    5. Decide if we want it.
    6. If so, add the package to the images.
    7. Adjust the cache and lint jobs to run the mtime fix.
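
    The steps above might amount to roughly the following in a lint job's script section. This is a sketch of a CI config fragment, not a tested job: the Debian package name, the restore-mtime invocation, and the cspell cache flags are all assumptions to verify.

```shell
# Hypothetical GitLab CI lint-job script (all names/flags are assumptions):
apt-get update && apt-get install -y git-restore-mtime  # step 1; later baked into the image
git restore-mtime                                       # rewrite mtimes from git history
cspell --cache --cache-strategy metadata "**"           # step 3: metadata instead of content
```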
  • 🇳🇱Netherlands bbrala Netherlands

    This is not going to work if we use sparse checkouts: the minimum mtime would then be the timestamp of the 50th commit back. There might be ways around a full clone being slow, like caching one via a cron job or something, but that seems like overkill.

    There might actually still be merit in doing this, since it seems we generate the cache on every commit (push), so the state of the repo is roughly the same as the cache when the MR runs (cache created at -50 commits, MR clones at -50 commits, then merges the changes and runs the linters).

    OK, I think it might still be worth it.
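
    The depth-limited-history problem can be demonstrated locally: commits beyond the clone depth are invisible, so restored mtimes can never be older than the oldest commit the clone can see. A self-contained sketch (repo layout and dates are made up for the demo):

```shell
#!/bin/sh
# Build a two-commit repo, then make a depth-1 clone of it: only one commit
# is visible there, so git-restore-mtime could not recover the 2020 timestamp.
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email demo@example.com
git config user.name demo
echo a > old.txt
git add old.txt
GIT_COMMITTER_DATE='2020-01-01T00:00:00' git commit -q -m old --date '2020-01-01T00:00:00'
echo b > new.txt
git add new.txt
git commit -q -m new
cd "$work"
git clone -q --depth 1 "file://$work/repo" shallow   # file:// so --depth is honoured
visible=$(git -C shallow rev-list --count HEAD)
echo "commits visible in shallow clone: $visible"
```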

  • Merge request !9673Draft: Resolve #3464413 "Baseline stats" → (Open) created by bbrala
  • Pipeline finished with Failed
    8 months ago
    Total: 896s
    #295472
  • Pipeline finished with Failed
    8 months ago
    Total: 365s
    #295487
  • Pipeline finished with Canceled
    8 months ago
    Total: 244s
    #295504
  • Pipeline finished with Failed
    8 months ago
    #295503
  • Pipeline finished with Failed
    8 months ago
    Total: 414s
    #295507
  • Pipeline finished with Canceled
    8 months ago
    Total: 550s
    #295506
  • 🇳🇱Netherlands bbrala Netherlands

    Well, that didn't really have any results.

    https://git.drupalcode.org/issue/drupal-3464413/-/pipelines/295506

    vs

    https://git.drupalcode.org/issue/drupal-3464413/-/pipelines/295507

    All times are comparable for the lint runs. Nothing much to do there, it seems. I changed to metadata and pulled everything in as artifacts, but it didn't really speed anything up. So unless I screwed up the changes, it's not worth it.

  • 🇬🇧United Kingdom catch

    I think it might at least be good to keep the timing on everything? It's useful to see the timings as much as possible.

    I hadn't realised we're spending less than 10 seconds on each actual linting step, which means we probably need to look at the other 1m20s for savings.
