NotOnServer only shows files if parent directory of missing file exists

Created on 10 September 2024, 11 months ago

Problem/Motivation

The "Not On Server" report won't show a file if the parent directory of the missing file doesn't exist.

Steps to reproduce

Upload a document with the following style of path, possibly using a media entity. I think the below is the default path generated for the "Document" media entity.

public://documents/2024-10/missing_filename.pdf

Delete only missing_filename.pdf then view the "Not On Server" report and the missing file will be shown.

Now delete either directory 2024-10 or documents and view the report again. The missing file won't appear in the report.

In our situation, we've deleted the documents directory as we didn't need any of them. Recreating that directory doesn't show what was in the subdirectories.

Proposed resolution

Find out why this is happening and fix it.

Remaining tasks

See above.

User interface changes

API changes

Data model changes

πŸ› Bug report
Status

Active

Version

4.2

Component

Not on server report

Created by

πŸ‡¦πŸ‡ΊAustralia imclean Tasmania

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @imclean
  • πŸ‡¦πŸ‡ΊAustralia imclean Tasmania
  • πŸ‡¦πŸ‡ΊAustralia imclean Tasmania

    I suspect the culprit might be this line in Drupal\auditfiles\Auditor\AuditFilesNotOnServer:

    $target = $this->fileSystem->realpath($result->uri);
    

    This is converting a URI found in the file_managed table to a realpath, but it appears to return false if the file parent directory doesn't exist when it needs to return the full path to the missing file.

    For context, here's the entire method:

      public function getReferences(): \Generator {
        $maximumRecords = $this->auditFilesConfig->getReportOptionsMaximumRecords();
        $query = $this->connection->select('file_managed', 'fm');
        $query
          ->orderBy('changed', 'DESC')
          ->range(0, $maximumRecords)
          ->fields('fm', ['fid', 'uri']);
        /** @var array<object{uri: string, fid: string}> $results */
        $results = $query->execute()->fetchAll();
        foreach ($results as $result) {
          $target = $this->fileSystem->realpath($result->uri);
          if ($target !== FALSE && !file_exists($target)) {
            yield FileEntityReference::create((int) $result->fid);
          }
        }
      }
    
  • @imclean opened merge request.
  • Status changed to Needs review 11 months ago
  • πŸ‡¦πŸ‡ΊAustralia imclean Tasmania
  • πŸ‡¦πŸ‡ΊAustralia imclean Tasmania
  • πŸ‡¦πŸ‡ΊAustralia imclean Tasmania

    Ref: https://www.php.net/manual/en/function.realpath.php

    Note:

    The running script must have executable permissions on all directories in the hierarchy, otherwise realpath() will return false.

    If a parent directory doesn't exist then realpath will return false, excluding it from this module's file_exists check and therefore not showing it in the "not on server report".

  • First commit to issue fork.
  • πŸ‡ΊπŸ‡¦Ukraine AstonVictor

    Hi there,

    I was able to reproduce the issue.
    I fixed merge conflicts and updated the MR

  • πŸ‡¬πŸ‡§United Kingdom jonathanshaw Stroud, UK

    +1

  • πŸ‡¬πŸ‡§United Kingdom jonathanshaw Stroud, UK
  • πŸ‡¦πŸ‡ΊAustralia imclean Tasmania

    @jonathanshaw I'm not sure what you mean, can you explain the exact scenario and how to replicate the problem?

  • πŸ‡¬πŸ‡§United Kingdom jonathanshaw Stroud, UK

    Actually I was too hasty. The problem with stream wrappers that don't support realpath already exists and this doesn't worsen it.

    But it could fix it:

    $base_dir = $this->fileSystem->realpath($scheme . '://');
    if ($base_dir) {
      $target = $base_dir . '/' . $path;
    }
    else {
      $target = $result->uri;
    }

    I'm not sure why we're using realpath here, but we should at least have a fallback when it's not available. My use case is s3fs.

  • πŸ‡¬πŸ‡§United Kingdom jonathanshaw Stroud, UK

    I've provided a fallback to manually convert paths to absolute, copying what s3fs decided to do in #3264640: S3fs does not handle paths with './' or '..' the same as core. β†’ . Sorry for somewhat hijacking this issue, hopefully you agree these concerns are closely entangled.

Production build 0.71.5 2024