Millions of cache files create issues

Created on 13 April 2025

Problem/Motivation

I installed and activated the module on a site where the cache queries were putting a heavy load on the database, and it immediately lifted that load; the site was as fast as ever.

Great!

But after having the module active for a few days, the server disk was full. The filecache directory contained more than 9 million files, with ./page and ./render holding the most files and hundreds of GB of data each. This essentially halted all operations and strangled MySQL, bringing the site down, and I had to clear out files manually via FTP and SSH.

The sheer number of files in each directory was a problem in itself for many tools and commands - for example, "Argument list too long" errors when trying to delete files.
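
For anyone hitting the same wall: the "Argument list too long" error comes from the shell expanding millions of filenames into a single command line, so rm /path/* fails. A rough sketch that avoids this (the path is only an example and needs adjusting) is to let find delete the files itself, or batch the names through xargs:

    # Let find walk the directory and delete files itself (no shell glob expansion)
    find /var/filecache/page -type f -delete

    # Alternative: stream the names to rm in batches via xargs
    find /var/filecache/page -type f -print0 | xargs -0 rm -f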

Once the space was cleared and the site was up again, I tried to clear the render cache via the UI (Admin Toolbar), but this ran for a very long time and ended with a 500 error. A drush cr command seemed to have no effect.

After freeing the space, the site ran again, but cache files are still piling up, with thousands added every hour. The site broke down again, and I removed some random cache files to revive it.

I wonder what actually will clean out these files - or at least keep their numbers from growing?

Steps to reproduce

I don't really know, actually, but this might also happen on other sites ...
I just installed and configured the module and let it run.
The site is fairly big (50k nodes, 5k registered users) with decent traffic, but not that huge.

Proposed resolution

Some way of limiting the number of cache files - by age, size, number or whatnot. I can't see that any current Drupal setting can control this.

I considered doing this with some snazzy Linux command and cron, but would prefer something built into Drupal for more control.
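
For illustration of that command-and-cron route (this is only a sketch, and /var/filecache/ is an assumed location that must be adjusted per setup): an age-based cleanup could delete anything older than a day, once an hour:

    # Example crontab entry: hourly, remove cache files older than 24 hours (1440 minutes)
    0 * * * * find /var/filecache/page /var/filecache/render -type f -mmin +1440 -delete > /dev/null 2>&1

An age-based cull like this keeps disk use bounded without touching recently written cache entries, but a setting inside Drupal would still be preferable.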

I also looked for other modules that might help, but the one contender I found - https://www.drupal.org/project/cacheflush/ - was not D11-ready. I tried adding D11 support on my local server, and that worked. The module also handles the filecache files, so it might be a solution. It seems to offer more granular cache control, but I still need to configure and run it to try it properly.

Feature request
Status

Active

Version

1.2

Component

Code

Created by

🇩🇰Denmark martin joergensen

Comments & Activities

  • Issue created by @martin joergensen
  • 🇩🇰Denmark martin joergensen

    I can see from earlier issues that there is already focus on the potentially huge number of files that File Cache can generate.

    For now I solved my problem with a script that simply deletes the oldest files. This seems to have very little influence on the performance of the site, but certainly keeps the number of files in a manageable range.

    For whatever it's worth to others, here's my script. I called it delete_oldest.sh:

    #!/bin/bash
    
    # Constants
    NUM_FILES_TO_DELETE=1000
    
    # Usage function
    usage() {
      echo "Usage: $0 [-y] /path/to/directory"
      echo "  -y    Skip confirmation prompt"
      exit 1
    }
    
    # Check for arguments
    CONFIRM=false
    while getopts ":y" opt; do
      case ${opt} in
        y )
          CONFIRM=true
          ;;
        \? )
          usage
          ;;
      esac
    done
    
    # Shift out processed options
    shift $((OPTIND -1))
    
    # Get directory argument
    TARGET_DIR="$1"
    
    if [ -z "$TARGET_DIR" ]; then
      usage
    fi
    
    # Make sure the directory exists
    if [ ! -d "$TARGET_DIR" ]; then
      echo "Error: Directory '$TARGET_DIR' does not exist."
      exit 1
    fi
    
    # Confirm if not using -y
    if [ "$CONFIRM" = false ]; then
      read -p "Are you sure you want to delete the $NUM_FILES_TO_DELETE oldest files in '$TARGET_DIR'? [y/N]: " answer
      case "$answer" in
        [yY][eE][sS]|[yY])
          ;;
        *)
          echo "Operation cancelled."
          exit 0
          ;;
      esac
    fi
    
    # Delete oldest files
    find "$TARGET_DIR" -type f -printf '%T@ %p\n' | \
      sort -n | \
      head -n "$NUM_FILES_TO_DELETE" | \
      cut -d' ' -f2- | \
      xargs -d '\n' rm -f
    
    echo "Deleted the $NUM_FILES_TO_DELETE oldest files in '$TARGET_DIR'"

    It's run by cron every 30 minutes for the directories that seem to grow fastest and largest.

    */30 * * * * /usr/bin/env delete_oldest.sh -y /var/filecache/page > /dev/null
    */30 * * * * /usr/bin/env delete_oldest.sh -y /var/filecache/render > /dev/null
    */30 * * * * /usr/bin/env delete_oldest.sh -y /var/filecache/dynamic_page_cache > /dev/null
    */30 * * * * /usr/bin/env delete_oldest.sh -y /var/filecache/data > /dev/null

    /var/filecache/ is of course the location of the cache files, and needs to be adjusted according to your setup.

    The script can also be run from the command line as
    delete_oldest.sh /var/filecache/render

    My version deletes the oldest 1000 files, and for now that seems to tame the file numbers. I started out with 10000 to get the count down, but after a while 1000 seems to be enough to cull each time.
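
    If you want to tune the count without editing the script, a small variation (just a sketch; the names are suggestions) is to turn the constant into an overridable default:

    # Replace the constant with a default that an environment variable can override
    NUM_FILES_TO_DELETE="${NUM_FILES_TO_DELETE:-1000}"

    # A heavier one-off cull can then be run as
    NUM_FILES_TO_DELETE=10000 ./delete_oldest.sh -y /var/filecache/render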

    Martin

  • 🇨🇦Canada nubeli

    This appears to be a duplicate of https://www.drupal.org/project/filecache/issues/3001324 (Limit number of cache files in cache bin directories). Martin, would you be interested in testing my patch in https://www.drupal.org/project/filecache/issues/3001324#comment-16058304 to see if that helps? It splits the cache items into sub-directories within each cache bin directory. We've patched our sites with it, and it has made a dramatic difference. But I don't think any of the sites are as large as the one you describe.
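
    For anyone curious what that approach looks like in practice: the general idea of sharding (this is only an illustration of the concept, not the actual patch code) is to place each cache file under a short hash-prefix sub-directory so no single directory ever holds millions of entries, roughly:

    # Hypothetical illustration of hash-prefix sharding, not the patch itself
    cid="config:views.view.frontpage"                       # an example cache ID
    prefix="$(printf '%s' "$cid" | md5sum | cut -c1-2)"     # first two hex chars of the hash
    mkdir -p "/var/filecache/render/$prefix"                # item is stored under that sub-directory
    # i.e. /var/filecache/render/ab/<filename> instead of /var/filecache/render/<filename>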
