StreamWrapperInterface realpath() and dirname() return docblocs are is inconsistent with documented use, actual Core implementation, and intention

Created on 27 March 2022, over 2 years ago
Updated 22 July 2024, 4 months ago

Problem/Motivation

The implementation of StreamWrapperInterface::realpath() should document allowing returning BOOL(FALSE) and StreamWrapperInterface::dirname() should not document a FALSE return.

Core and contrib do return FALSE for realpath() as does PHP realpath() which the method is modeled after.
No functional core code returns FALSE for dirname(). PHP dirname() ALWAYS returns a string, and all core methods that call StreamWrapperInterface::dirname() expect only strings (no checking for FALSE) is done.

Steps to reproduce

Review StreamWrapperInterface.
Attempt to implement the interface on a stream wrapper Returning FALSE for realpath() gets flagged as an incompatible return type.

Proposed resolution

Update StreamWrapperInterface::realpath() doclbock @return to be string|BOOL. While only FALSE Is actually returned using BOOL keeps the value in line with the rest of core docblocks and is how Drupal Core has historically indicated a FALSE return.
Remove documentation about a FALSE return from StreamWrapperInterface::dirname()

Attempting to change the return type to FALSE would increase the scope of work and would be better handled by the PHPStan initiative, especially since Core to date has chosen to not implement L9 PHPStan scanning.

πŸ› Bug report
Status

Needs work

Version

11.0 πŸ”₯

Component
File systemΒ  β†’

Last updated about 11 hours ago

Created by

πŸ‡ΊπŸ‡ΈUnited States lhridley

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    Following up on this issue, which is now back on my radar screen.

    StreamWrappers for remote object stores cannot comply with creating a directory, as that concept does not exist for remote object stores. By limiting the Interface dirname() to require a value, you are effectively making the StreamWrapperInterface incompatible with remote object stores such as S3, Digital Ocean Spaces, Google's object store (don't remember the name currently), etc, requiring any contrib modules that allow utilizing a remote object store for file storage to write their own interfaces, which may or may not be compatible with Drupal's File system interfaces, which all originate with the StreamWrapperInterface.

    the S3 protocol complies with the php streamwrapper class, which also does not have the method dirname(), and the documentation indicates that calls to the class methods rmdir() and mkdir() should not be made if the wrapper does not support creating or removing directories.

    By not allowing FALSE as a return value, it significantly limits the use of the Core StreamWrapperInterface as a viable interface for alternative file storage mechanisms such as those that comply with the S3 protocol, which at its heart IS a streaming interface.

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    Good timing on bringing this up again, I was just going through another round of PHPSTAN L9 on the s3fs module and thinking "did we have an issue open yet for these incorrectly documented methods?"

    Actionable Concern:

    Really my ultimate concern in the proposed patch that sent this back to needs-work is as mentioned in #5 is dirname() and FileSystemInterface.

    If we change StreamWrapperInterface::dirname() to allow a false return (as the comment implies is allowed but the @return annotation does not) we have an issue with FIleSytemInterface::dirname()

    Here is the current implementation of FileSystem (inline comments my own)

     /**
       * Gets the name of the directory from a given path.
       *
       * PHP's dirname() does not properly pass streams, so this function fills that
       * gap. It is backwards compatible with normal paths and will use PHP's
       * dirname() as a fallback.
       *
       * Compatibility: normal paths and stream wrappers.
       *
       * @param string $uri
       *   A URI or path.
       *
       * @return string
       *   A string containing the directory name.
       *
       * @see dirname() // Note that php dirname() does NOT support returning FALSE.
       * @see https://www.drupal.org/node/515192
       * @ingroup php_wrappers
       */
      public function dirname($uri) {
        $scheme = StreamWrapperManager::getScheme($uri);
    
        if ($this->streamWrapperManager->isValidScheme($scheme)) {
          return $this->streamWrapperManager->getViaScheme($scheme)->dirname($uri); // This calls StreamWrapperInterface::dirname(). MR2017 suggests this be allowed to return FALSE, while  self::dirname() says false is not an approved return value. If Core ran a higher level of PHPstan this would of been flagged by automated testing.
        }
        else {
          return dirname($uri); // PHP dirname() ALWAYS returns a string.
        }
      }

    That all means we need to figure out FIleSystemInterface::dirname() do we add @return |FALSE here too? or do we need to have FileSystem::dirname() catch the FALSE return and convert it to a string of some form?

    Even if a storage doesn't support the concept of '/' being a directory, it could still return "scheme://" eg "http://www.example.org/dir/file1/file.txt" can still have a dirname() of "http://www.example.org/dir/" and get a useful outcome.

    This is why I think we might be better for dirname() to just remove the 'false' return comment and keep the return typehint as is, though that itself might be a BC break.

    More generic thoughts:

    On a more generic standpoint of dealing with these issues (that may be much wider scope than should be consider here)

    By not allowing FALSE as a return value, it significantly limits the use of the Core StreamWrapperInterface as a viable interface for alternative file storage mechanisms

    There is some validity to this.

    I'll note that the Drupal s3fs module has implemented dirname(), to my knowledge it ends up working since anywhere your working with /path/to/folder a streamWrapper works too, its only realpath() we return FALSE on. The aws-sdk-php doesn't implement it because PHP doesn't pass dirname() to streamWrappers. I will note \AWS\S3\StreamWrapper (used by s3fs) actually DOES have a concept of directories, even though S3 itself is a flat storage system.

    As you mention mkdir() in PhpStreamWraperInterface is a method that should NOT be implemented if the streamWrapper does not support it. Obviously that can not be done if you call 'implements PhpStreamWrapperInterface' which really means methods like this should NOT be on the base interfaces.

    Since items like dirname() and realpath() are our own methods they strictly don't have to follow PHP's requirements, however I would question the wisdom of them deviating too far from them, the more our methods are 'drop in' replacements the better it is for code compatibility.

  • πŸ‡ΊπŸ‡ΈUnited States lhridley
  • πŸ‡ΊπŸ‡ΈUnited States lhridley
  • Pipeline finished with Failed
    8 months ago
    Total: 241s
    #133908
  • Pipeline finished with Failed
    8 months ago
    #133910
  • Pipeline finished with Success
    8 months ago
    Total: 582s
    #133915
  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    Updated the PR for the 10.3.x branch.

    Traced documentation and implementation of dirname() on FileSystemInterface, while not directly related to StreamWrapperInterface, is nonetheless relevant.

    All implementations of StreamWrapperInterface in core trace back to abstract class LocalStream, which handles (and can) returning a FALSE value on ::realpath(), so changing the documentation and return value typehinting on StreamWrapperInterface::realpath() to reflect a return value of FALSE.

    Re-reviewed the implementation of FileSystemInterface::dirname() as well as PHP's dirname(), neither will return FALSE. LocalStream::dirname() also does not return FALSE as well. Revised the return value comment to remove reference to a FALSE return value.

  • Status changed to Needs review 8 months ago
  • πŸ‡ΊπŸ‡ΈUnited States lhridley
  • Status changed to Needs work 8 months ago
  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    I believe core would require this target 11.x first and be back ported to older branches (if eligible).

    It would be better do the rebase now to avoid this issue being delayed in a few months.

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Can we take this further and add string|false a the typehint rather than just the docs?

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Ah, we can't add return typehints on interfaces without breaking bc.

  • Pipeline finished with Failed
    8 months ago
    Total: 171s
    #142298
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Created a new MR based on 11.x.

    I think we should go through the rest of the methods to ensure we have the correct documented return types.

  • Pipeline finished with Success
    8 months ago
    Total: 1122s
    #142303
  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    @kim.pepper -- I reviewed every defined in Drupal\Core\StreamWrapper\StreamWrapperInterface back in 2022 when I opened this issue. The two identified in this issue were the only ones whose documentation differed from the actual return type defined on the interface signature.

  • Status changed to Needs review 8 months ago
  • πŸ‡ΊπŸ‡ΈUnited States lhridley
  • Status changed to RTBC 7 months ago
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Thanks for confirming. This docs change looks good to me.

  • Status changed to Needs work 7 months ago
  • πŸ‡¬πŸ‡§United Kingdom alexpott πŸ‡ͺπŸ‡ΊπŸŒ

    \Drupal\file_test\StreamWrapper\DummyExternalReadOnlyWrapper::dirname() returns FALSE. I do not think we should be changing that part of the interface.

    We need to see what contrib is doing and a cursory search shows https://git.drupalcode.org/project/brandfolder/-/blob/2.0.x/src/StreamWr...

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    @alexpott Can you advice how you want this resolved given the points raised in #10 that the rest of core/contrib is not equipped to handle a FALSE return from dirname()?

    Sometimes the @return is wrong and sometimes the comments are wrong. In this case it appears to me the documentation is wrong for dirname() not the @return.

  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    @alexpott #25 we only ever return a string inline with the PHP dirname() function, so our documentation is wrong. The example you provided typehints the return type as bool which is wrong.

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    @alexpott -- being able to return FALSE for dirname() would be awesome, especially for remote stream management to/from assets on object stores like S3. The original docblock notes indicated that both ::realpath() and ::dirname() werre supposed to be capable of returning a boolean, but the actual type hint on the return value as implemented were set for strings only.

    (Begin my soapbox contribution)

    IMO, we should not have either ::realpath() or ::dirname() on the StreamWrapperInterface, as the concept of both may not actually exist on the Streaming application that the wrapper is intended to integrate. For key/value object stores it's absolutely not applicable. If we ARE going to keep them on the StreamWrapperInterface, then we should make their implementation consistent with the PHP implementation of both:

    https://www.php.net/manual/en/function.realpath (returns string or FALSE)
    https://www.php.net/manual/en/function.dirname.php (returns string)

    PHP's dirname() and realpath() don't support "streaming" per se. They both are implicitely structured to work with a traditional, local file system. Implementing dirname() and realpath() on Drupal\Core\File\FilesystemInterface is appropriate (somewhat), but I still think even there they should return FALSE when they're not applicable (like streaming applications).

    This ticket, and the ensuing discussion, exposes what is IMO a fundamental flaw in the way the Drupal Core has implemented StreamWrappers. The implementation has in implicit assumption that the underlying roots of any StreamWrapper implementation is working with a traditional mounted file system. In practice, these two methods cause workarounds that have to be implemented for a non-traditional alternative to the more traditional mounted file system, such as remote object stores.

    As we have evolved other parts of Drupal Core to adapt to a constantly changing tech landscape, this is a particular component that needs to be revisited, as simple changes to the implementation open up a broader world of possibilities for more effectively using alternative file systems as they are intended to work. Breaking changes will likely result, but sometimes breaking changes are necessary to evolve.

    A more radical change would be to decouple Drupal\Core\StreamWrapper\StreamWrapperInterface from Drupal\Core\StreamWrapper\PhpStreamWrapperInterface. This could actually be done without alot of disruption simply by:

    • Making the two Wrapper Interfaces independent (in other words don't have StreamWrapperInterface implement PhpStreamWrapperInterface).
    • Deprecate the "extended" methods on StreamWrapperInterface that are actually a part of PhpStreamWrapperinterface, with messages to revise any custom implementation (custom code and contrib modules) to implement BOTH interfaces.
    • Pick a date for deprecations to take effect (11.x release perhaps)
    • Revise Core to decouple the two interfaces

    The code changes would be relatively minor, and this would make the StreamWrapper Management component of Drupal infinitely more flexible with regard to managing a wider variety of streaming services, including video streaming, podcasting, etc in addition to more traditional asset storage that is using non-traditional mounted file systems for storage (such as remote object stores, for example).

    (End my soapbox contribution)

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    being able to return FALSE for dirname() would be awesome, especially for remote stream management to/from assets on object stores like S3

    As an S3 streamWrapper maintainer (s3fs module) I'm not able to see any advantage to allowing FALSE, Can you please provide an example of how not having dirname() would help an object store streamWrapper to help me understand why this would be important?

    AWS PHP SDK StreamWrapper Interface: https://github.com/aws/aws-sdk-php/blob/master/src/S3/StreamWrapper.php <== neither method name exists
    Google Cloud Storage StreamWrapper class: https://github.com/googleapis/google-cloud-php-storage/blob/main/src/Str... <== neither method name exists

    It is important to note that no PHP streamWrapper ever implements these methods. See https://www.php.net/manual/en/class.streamwrapper.php

    I will admit I always found it a bit clumsy that in Drupal a streamWrapper class implements code intended for use by PHP directly and code that Drupal Core expects to call into directly rather than separating them into two different systems.

    That said, both methods were added to the Drupal API to allow more control rather than depending solely on PHP's implementation. This is especially true for PHP realpath() which will always return FALSE for a streamWrapper, even if its backed by local disk storage. Though there is a good argument that no Drupal streamWrapper should ever implement realpath() to ensure file access always goes through the streamWrapper.

    As an s3fs maintainer: we would have no issue with these methods being removed and relying on PHP implementations of these methods.

    I will note that if you absolutely need a flat structure this simple code would work for dirname()

    public function dirname($uri): string {
      return 's3://';
    }
    
  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    @cmlara You summed it up nicely:

    It is important to note that no PHP streamWrapper ever implements these methods. See https://www.php.net/manual/en/class.streamwrapper.php

    I will admit I always found it a bit clumsy that in Drupal a streamWrapper class implements code intended for use by PHP directly and code that Drupal Core expects to call into directly rather than separating them into two different systems.

    The example of how not having the ::dirname() defined on the StreamWrapperInterface is, a module maintainer doesn't have to fight with working out how to handle a method that is totally irrelevant to the functionality of a StreamWrapper, and write and maintain unnecessary code that contributes no value to the functional aspects of the StreamWrapper adapter that is being registered with Drupal's StreamWrapperManager service.

    I've been a maintainer of the Flysystem project for about 4 years, I got involved whenone of the maintainers left the community and we had several client projects that utilized Flysystem. The buggiest part of the initial implementation of Flysystem was the unnecessary methods such as dirname(), realpath(), and related methods that were also badly implemented. These have been refactored somewhat to make them better, but untangling the hacks that were put in place to force them to work is like trying to hook a race horse to a plow and expecting it to be effective :).

    Flysystem defines protocols in configuration, which for example will allow for different definitions of a stream wrapper adapter that implement different protocols (i.e, "docx://", "pdf://", "image://" etc. I can then define media entities that are configured to utilize different stream wrapper adapters that will store an entry in the file managed table utilizing the protocol defined for the different types.

    As a result, I can then send "docx://" to one S3 bucket, "pdf://" to a different S3 bucket, "image://" to Google Cloud Storage, and "styles://" to the local file system if my business use case necessitated such a structure. And I've used it to manage scientific studies that were PDFs living on different S3 buckets that were managed by the respective research teams in legacy systems outside of Drupal and uploaded to an S3 instance that they managed. We built a Drupal content management system to index the studies in Drupal and make them searchable with Solr. The team would then add the metadata about the study to the Drupal entry, and "publish" the study in Drupal, which would then in turn make it visible for Solr to index.

    We used Flysystem to manage the Drupal integration with the different S3 buckets.

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    Additional info on contrib modules: Remote Stream Wrapper β†’ , which implements Drupal\Core\StreamWrapper\StreamWrapperInterface by extension. Looks like this module implemented StreamWrapperInterface::realPath() by the documentation and not by the interface defined return type. StreamWrapperInterface::dirname() is implemented by the defined return type, which is string.

    ::realPath
    Returns FALSE;
    https://git.drupalcode.org/project/remote_stream_wrapper/-/blob/2.x/src/...
    ::dirname
    Returns a string;
    https://git.drupalcode.org/project/remote_stream_wrapper/-/blob/2.x/src/...

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    Circling back for additional information in looking at the implementation of stream wrappers in Drupal core, as it is relevant to this discussion.

    Drupal core has three active stream wrappers currently registered:

    Two of the three stream wrappers extend the abstract class \Drupal\Core\StreamWrappers\LocalStream (PublicStream and TemporaryStream), and the third, AssetsStream, extends PublicStream.

    LocalStream::realpath() returns the result of LocalStream::getLocalPath(), which has return types of string|bool, and in code will return either FALSE, or a string..
    PublicStream::getLocalPath() overrides the parent (LocalStream::getLocalPath()) method, and also return either FALSE, or a string.
    TemporaryStream by inheritance implements the code defined in it's parent, LocalStream.

    In practice, it appears that Drupal Core's stream wrappers fail to adhere to the current Interface as it is described.

    In tracing the source of the implementation of StreamWrapperInterface::dirname() back through the same streamwrappers, all inherit from LocalStream::dirname(), which is defined with a return type of string on the interface, and adheres to that.

    Conclusion: the definition of StreamWrapperInterface::realpath() should be modified to implement return types of string|bool, as described in the current comments.

    This change will not break B/C in core, but should probably be identified in a release as a breaking change, as it could impact contrib modules that are using Drupal\Core\StreamWrapper\StreamWrapperInterface. Also worth noting, none of the stream wrappers in contrib modules that I've reviewed implement ::realpath() in adherance to the Interface, so they should in practice be fine.

  • Pipeline finished with Canceled
    7 months ago
    Total: 126s
    #147601
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    A typehint of string|false would be more accurate. We never return TRUE.

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    lhridley β†’ changed the visibility of the branch drupal-3271894-3271894-streamwrapperinterface-return-types to hidden.

  • Status changed to Needs review 7 months ago
  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    Changing description of issue, since interface definition of return types on StreamWrapperInterface::dirname() is no longer relevant, and pursuant to the recently added comment #32 πŸ› StreamWrapperInterface return types for realpath() and dirname() are inconsistent with documented use and intention Needs work .

    @kim.pepper -- after looking at the return types traced back on the actual implementation of protected methods in LocalStream, I modified it for consistency.

    I have no strong feelings about string|bool vs string|FALSE, but changed the original PR for consistency with current implementations. I'm open to modifying this to string|FALSE, but would that not also mean that the return type for LocalStream::getLocalPath() would also need to be updated?

  • Pipeline finished with Success
    7 months ago
    Total: 1051s
    #147603
  • Pipeline finished with Success
    7 months ago
    Total: 1046s
    #147605
  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    smustgrave β†’ changed the visibility of the branch 3271894-streamwrapperinterface-return-types to hidden.

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    Went to review and hiding older branch.

    But seems needs clarity on string|bool vs string|FALSE

  • πŸ‡ΊπŸ‡ΈUnited States smustgrave

    So what's next steps needed for this one?

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    @smustgrave, thank you for asking.

    1. Clarity on whether the typehint needs to be string|bool vs string|FALSE
    2. If the decision is string|FALSE do we also need to revise the return types for LocalStream::getLocalPath() and PublicStream::getLocalPath() to string|FALSE as well.

    For clarity, the method ::getLocalPath() in the LocalStream and PublicStream classes are protected methods that are called directly by LocalStream::dirname() and PublicStream::dirname() but have a return type of string|bool instead of string|FALSE.

    At least, this is the next step from my vantage point.

  • Status changed to RTBC 6 months ago
  • πŸ‡¦πŸ‡ΊAustralia kim.pepper πŸ„β€β™‚οΈπŸ‡¦πŸ‡ΊSydney, Australia

    Thanks @lhridley for your detailed investigation. I'm happy with the changes in the MR.

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    A note from a PHPStan standpoint:

    @return string|bool
    Means that anyone calling the method needs to check for TRUE as well before passing the return to a function that only accepts strings.

    This means the classic if ($value === FALSE) {}
    check needs to usually be re-written toif (is_bool($value)) {} even though the comments only describe FALSE.

  • πŸ‡³πŸ‡ΏNew Zealand quietone

    I have read the issue summary, the comments and the MR. After reading the MR I see that the proposed resolution does not match the code changes.

    I am not sure about this.There is an instance in core that returns FALSE for dirname and this issue is removing mentioning that in the comment for the return. And the change to realpath allows boolean, but there is no implementation in core the returns TRUE. The change to the @return and the return comment should be in agreement.

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    Updated the IS to match the steps agreed in the issue and SLACK conversations.

    There is an instance in core that returns FALSE for dirname

    Our reviews did not find this in functional code. I've updated the IS based on the belief this isn't the case.

  • πŸ‡³πŸ‡ΏNew Zealand quietone

    I don't know what the historical information is that is being referred to. Using git I learned that the first use of 'sting|false' was in in 2012.

  • Status changed to Needs work 5 months ago
  • πŸ‡¬πŸ‡§United Kingdom alexpott πŸ‡ͺπŸ‡ΊπŸŒ

    The file that returns FALSE is core/modules/file/tests/file_test/src/StreamWrapper/DummyExternalReadOnlyWrapper.php it needs fixing if we're going to make this change.

  • πŸ‡ΊπŸ‡ΈUnited States lhridley

    The return type declared in LocalStream::getLocalPath() also needs to be redefined as string|FALSE. In reviewing the code, it does not appear that TRUE is ever returned, you either get a string, or a FALSE.

    Same with PublicStream::getLocalPath(), which inherits the doc comments from LocalStream::getLocalPath(), which it extends.

Production build 0.71.5 2024