Allow image style derivatives to use a separate stream wrapper

Created on 14 April 2023, about 1 year ago
Updated 13 July 2023, 11 months ago

Problem/Motivation

As discussed in "Make css/js optimized assets path configurable" (fixed), there are cases where sites might want image derivatives stored separately from their main public files directory. We could look at using the new assets:// stream wrapper or creating a new one.


📌 Task
Status

Closed: duplicate

Version

11.0 🔥

Component
Image module

Last updated 7 days ago

Created by

🇬🇧United Kingdom catch


Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • Issue created by @catch
  • 🇧🇪Belgium Wim Leers Ghent 🇧🇪🇪🇺

    +1!

    I think this is a blocker for S3FS?

  • 🇬🇧United Kingdom catch

    It doesn't block s3fs but it does help.

    A common use case for s3fs is a site hosted where disk space is comparatively expensive (say, Pantheon) that has a lot of uploaded files like images and videos. You can use s3fs' takeover public files setting and then host and serve all those images from files.example.com on S3 or an equivalent.

    However, this breaks core image derivative generation, so s3fs has complex logic that rewrites image derivative URLs to ensure the image is served from PHP until it's on disk, then reverts to serving it from disk after that.

    If we add this to core, sites could put image derivatives back on their main install, meaning no URL rewriting to worry about. It also means no extra DNS requests for those images.

  • 🇮🇹Italy mondrake 🇮🇹

    Adding related

  • 🇺🇸United States cmlara

    As @catch notes, this is not a blocker for s3fs, so I'm removing the Contributed Project Blocker tag.

    Regarding #3027639-97: Make css/js optimized assets path configurable:

    You're right this cannot work for private://. But \Drupal\image\Entity\ImageStyle::fileDefaultScheme() currently returns either private or public. What would the harm be in making that private (unchanged!) or assets? 🤔

    The fileDefaultScheme() method is only actually used by read-only streamWrappers to determine where they should store files since they have no writable filesystem location, see @buildUri for more details.

    The most common example would be something like remote_stream_wrapper, where files from the http/https protocols end up on the default scheme, often 'public://' but sometimes 'private://'. Everything else ends up "placed on the same filesystem where the images originate." Changing just this method's return value won't make public:// scheme image derivatives be stored on 'assets://'.
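
    The selection rule described in this comment can be modeled roughly as follows. This is a simplified sketch in Python, not Drupal's actual code: the function name `derivative_scheme` and the `WRITABLE_SCHEMES` set are assumptions made for illustration only.

```python
# Simplified model of the behavior described above: derivatives of files on a
# writable stream wrapper stay on the source scheme, and only derivatives of
# read-only sources fall back to the default scheme.
# WRITABLE_SCHEMES is a hypothetical stand-in for asking each wrapper
# whether it is writable.
WRITABLE_SCHEMES = {"public", "private", "assets"}


def derivative_scheme(source_uri: str, default_scheme: str) -> str:
    """Return the scheme a derivative of source_uri would be stored on."""
    source_scheme, separator, _target = source_uri.partition("://")
    if not separator:
        # No scheme on the source: use the default.
        return default_scheme
    if source_scheme in WRITABLE_SCHEMES:
        # Writable source: the derivative inherits the source scheme.
        return source_scheme
    # Read-only source (for example a remote http/https wrapper).
    return default_scheme


# Changing the default to "assets" only affects read-only sources.
print(derivative_scheme("public://photo.jpg", "assets"))
print(derivative_scheme("https://example.com/photo.jpg", "assets"))
```

    Under these assumptions, a public:// source still yields a public:// derivative even when the default scheme is changed, which is the point being made above.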

    Quoting myself from #3027639-91: Make css/js optimized assets path configurable:

    Beyond that, image styles are a different cost equation (bandwidth transfer, computational cost, local disk storage, etc.) compared to css/js file creation and should be given a much deeper discussion around DoS potentials, especially as you start discussing multi-server setups with load balancing proxies. We may very easily significantly weaken the ITOK protection system.

    My main concerns are that read-only streamWrappers and other remote streamWrappers are significantly "more expensive" (time/bandwidth) than local disk derivatives, and pose possible attack vectors that we don't normally consider with local disk storage. Even just calling file_exists() can have performance concerns when it comes to non-local streamWrappers.

    Using rsw as an example: if it stores its files on assets:// and there are 10 Drupal frontend systems behind a load balancing proxy, it could be possible to force Drupal to generate the image derivative 10 times. If this is a "large" file (100 MB, for example), that could be up to 1 GB of transfer bandwidth from just a dozen KB of HTTP requests (a form of bandwidth amplification attack), and such an incident could occur any time a frontend is rebuilt if it does not persist its asset storage (Docker containers). Factor in that EVERY image could be a vector. This also ignores the storage amplification of having to store the same derivative multiple times and our possible need to implement cleanup for the image style directory as a 'temp' folder.
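
    The arithmetic behind this example can be checked with a short calculation. This is only a sketch using the figures quoted above (10 frontends, a 100 MB source file, roughly a dozen KB of requests in total); the helper name `amplification` and the use of decimal units are assumptions.

```python
# Worked arithmetic for the bandwidth amplification example above.
# Units are decimal (1 MB = 1,000,000 bytes), chosen for simplicity.
KB = 1_000
MB = 1_000_000


def amplification(frontends: int, source_bytes: int, request_bytes: int):
    """Return (bytes fetched from the remote source, amplification factor).

    Assumes each frontend fetches the full source file once to build its own
    copy of the derivative, and request_bytes is the total size of the HTTP
    requests needed to trigger that.
    """
    transferred = frontends * source_bytes
    return transferred, transferred / request_bytes


transferred, factor = amplification(
    frontends=10, source_bytes=100 * MB, request_bytes=12 * KB
)
print(f"{transferred / MB:.0f} MB fetched, about {factor:,.0f}x amplification")
```

    With these figures the ten frontends fetch 1,000 MB (1 GB) in total from about 12 KB of requests, and the cost repeats whenever a frontend loses its local derivative storage.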

    Perhaps this could be mitigated by the load balancer always routing requests for a given path to a specific frontend, or by a CDN placed in front that caches the results and refuses to pass the request on. However, those would be protections outside of Drupal, and I'm not sure we can call our deployment secure if it mandates external security to be configured.

    The other option is moving assets:// to centralized storage, though that puts modules like s3fs right back where we already are: we just have one more scheme to deal with, and, even worse, we have the complexity of dealing with CSS/JS on-demand generation again. I personally would prefer a separate image_styles:// streamWrapper to make it easier to separate the two operations.

    I think the "how to handle private like schemes" should be discussed first, if all public derivatives went to 'image_styles://' and all private ones went to 'private://' or 'some-other-private://' based on the derivative source it feels weird to have two different standards on where to store derivatives.

  • 🇬🇧United Kingdom catch

    Just for flexibility, it seems worth having image_styles:// rather than adding images to assets:// - nothing stops a site pointing them to the same directory (or sub-directories of a parent directory) but the space considerations are definitely very different from css/js aggregates.

    I think the "how to handle private like schemes" should be discussed first, if all public derivatives went to 'image_styles://' and all private ones went to 'private://' or 'some-other-private://' based on the derivative source it feels weird to have two different standards on where to store derivatives.

    private:// is already well-understood as a special stream wrapper that's outside the filesystem, so forcing all 'private' things to go there is consistent with what we do now. All we're letting you do here is separate image styles from public://. It definitely needs explicit documentation that this is what's happening, but I don't think it would be weird.

  • 🇺🇸United States cmlara

    It definitely needs explicit documentation that this is what's happening, but I don't think it would be weird.

    Sounds like a fair method to handle this.

  • 🇧🇪Belgium Wim Leers Ghent 🇧🇪🇪🇺

    Agreed with #6 😊

  • Status changed to Closed: duplicate 11 months ago
  • 🇬🇧United Kingdom catch

    This is sadly a duplicate of "Add new stream wrapper(s) to store generated files separately" (needs work).
