Bad performance with large private video playback on S3

Created on 8 January 2023, almost 2 years ago
Updated 18 May 2023, over 1 year ago

Problem/Motivation

Large video files take a long time to start playing and stutter when stored as "private" on S3.

Steps to reproduce

  1. Install the S3FS module.
  2. Create an S3 bucket.
  3. In settings.php, configure private files to be stored on S3:
      $settings['s3fs.upload_as_private'] = TRUE;
      $settings['s3fs.use_s3_for_private'] = TRUE;
  4. Create a content type with two File fields: a public one with Upload destination "S3 File System" and a second one with Upload destination "Private files (s3fs)".
  5. Create a node and upload the same large video file to both destinations.
  6. View the videos. The public one loads quickly, while the private one takes some time to load. Seeking within the private video is also slow.

An example of two identical files stored in the same S3 bucket, one public and one private, can be seen here:
Public:
https://drupal-test-s3.s3.eu-west-1.amazonaws.com/2023-01/test1_0.mp4
Private:
https://drupal9-t1.webaxy.com/system/files/2023-01/test1.mp4

This is a 1.5GB file, which plays immediately when viewed directly from S3 but takes over 20 seconds to load when served as a private S3FS file.

Any solution for this issue?

💬 Support request
Status

Closed: won't fix

Version

3.1

Component

Code

Comments & Activities

  • Some more testing.
    I opened: https://drupal9-t1.webaxy.com/system/files/2023-01/test1.mp4 on Firefox in a "Private Window" from a remote location.

    1. On the Drupal EC2 server I can see two files being downloaded simultaneously, each 1.6GB (in the screenshot you can see them still being downloaded):

    2. Once they are downloaded, the video starts streaming and the first two files are deleted from disk, but the stream comes from a third "get" that starts downloading a third file:

    You can see from the browser network analysis the time it took the first two files to download (over 21 seconds), after which the video started to stream ("receiving" mode) as the third file started downloading:

  • 🇺🇸 United States cmlara

    Interesting.

    I loaded your test file and can confirm your observations.

    The second request is a ranged request to near the end of the file (Range: bytes=1694629888-). As highlighted in your screenshot, it took an extended period of time (19 seconds in my test). This would presumably be the time it takes Drupal to download the file from the bucket up to that range mark before it can start streaming the result to the client.

    The 1st request is from the very beginning of the file and the 3rd request is ranged to be near the start of the file and as such did not have a significant delay.
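
To make the offsets concrete, here is a minimal Python sketch that parses a single-range Range header like the one in the second request and returns the byte positions it covers. The function name `parse_byte_range` and the 1,700,000,000-byte file size are illustrative assumptions (the exact size of test1.mp4 is not stated in this thread). The sketch only does the arithmetic and says nothing about how s3fs handles ranges internally.

```python
import re
from typing import Tuple


def parse_byte_range(header: str, file_size: int) -> Tuple[int, int]:
    """Return the inclusive (first, last) byte positions of a single-range header.

    Supports 'bytes=start-end', 'bytes=start-' and 'bytes=-suffix_length'.
    Raises ValueError for malformed or unsatisfiable ranges.
    """
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", header.strip())
    if not match:
        raise ValueError(f"unsupported Range header: {header!r}")
    start_text, end_text = match.groups()
    if start_text == "" and end_text == "":
        raise ValueError("range has neither start nor end")

    if start_text == "":
        # Suffix form: the last N bytes of the file.
        suffix_length = int(end_text)
        if suffix_length == 0:
            raise ValueError("empty suffix range")
        first = max(file_size - suffix_length, 0)
        last = file_size - 1
    else:
        first = int(start_text)
        # An open-ended range runs to the last byte of the file.
        last = min(int(end_text), file_size - 1) if end_text else file_size - 1

    if first >= file_size or first > last:
        raise ValueError("range not satisfiable")
    return first, last


if __name__ == "__main__":
    HYPOTHETICAL_SIZE = 1_700_000_000  # assumption, not the real file size
    first, last = parse_byte_range("bytes=1694629888-", HYPOTHETICAL_SIZE)
    print(f"bytes before the requested offset: {first} ({first / 2**30:.2f} GiB)")
    print(f"bytes actually requested: {last - first + 1}")
```

If the server really has to read the object sequentially from the start, as presumed above, about 1.58 GiB would need to be fetched before the first requested byte could be sent, which would be consistent with the delay seen on the second request.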

    I didn't see multiple requests occur when testing my own random mp4 files, which leads me to suspect this is likely related to the encoding/containerization of the video files. Containerization of video is a fairly complex subject, and I recall this being a common issue when I was responsible for deploying web filtering appliances.

    Without actually analyzing the files, I suspect that the difference in behavior can be explained by the samples I used likely being packaged to be more stream friendly, while the sample you are working with is packaged with the metadata towards the end.
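
One way to check this suspicion without a full media toolchain is to list the top-level boxes of the MP4 file. The sketch below assumes the "metadata" in question is the `moov` box of the MP4 (ISO base media) container and that the media data sits in `mdat`; a file with `moov` ahead of `mdat` can start playing from its first bytes. The helper names are hypothetical, and this has not been run against the test file from this issue.

```python
import struct
from typing import BinaryIO, List, Tuple


def top_level_boxes(stream: BinaryIO) -> List[Tuple[str, int, int]]:
    """Return (type, offset, size) for each top-level box in an MP4 stream.

    Only box headers are read; payloads are skipped with seek(), so this
    stays cheap even for multi-gigabyte files.
    """
    boxes = []
    offset = 0
    while True:
        stream.seek(offset)
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file (or trailing garbage)
        size, box_type = struct.unpack(">I4s", header)
        header_length = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" field follows the type.
            extended = stream.read(8)
            if len(extended) < 8:
                break
            size = struct.unpack(">Q", extended)[0]
            header_length = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the file.
            stream.seek(0, 2)
            size = stream.tell() - offset
        if size < header_length:
            break  # malformed box; stop instead of looping forever
        boxes.append((box_type.decode("latin-1"), offset, size))
        offset += size
    return boxes


def moov_before_mdat(stream: BinaryIO) -> bool:
    """True if the file has both boxes and 'moov' comes before 'mdat'."""
    types = [box_type for box_type, _, _ in top_level_boxes(stream)]
    if "moov" not in types or "mdat" not in types:
        return False
    return types.index("moov") < types.index("mdat")
```

Usage would be along the lines of `with open("test1.mp4", "rb") as f: print(top_level_boxes(f))`. If `mdat` is listed before `moov`, a player typically has to request the end of the file before it can start decoding, which would match the early ranged request near the end of the file.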

    Given the above scenario, s3fs with private files would never really be able to make that second step quick, because the client needs the data and we have to download it from the bucket. The only exception would be if it were caching and the file were still 'hot', having been fully downloaded previously and saved locally. Even the first request would not be sufficient for that, as only a small portion is downloaded.

    That would appear to leave you with your original suggestion to use private:// from local disk, my original suggestion to investigate a custom controller with pre-signed redirects, or a new option: re-encoding/re-packaging the media files to make them more 'stream' friendly (though that wouldn't solve seeking speed, it could solve the time to first frame).
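
For the pre-signed redirect option, the rough idea would be for a controller to check access and then redirect the browser to a short-lived signed S3 URL, so that ranged requests go straight to the bucket instead of through Drupal. The sketch below builds such a URL with AWS Signature Version 4 query parameters using only the Python standard library. It illustrates the signing steps and is not s3fs code; in practice an AWS SDK's own presign call would normally be preferable. The function name, the virtual-hosted host layout, and the credentials in the usage note are assumptions.

```python
import datetime
import hashlib
import hmac
from typing import Optional
from urllib.parse import quote


def _sign(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presign_get_url(bucket: str, key: str, region: str, access_key: str,
                    secret_key: str, expires: int = 300,
                    now: Optional[datetime.datetime] = None) -> str:
    """Build a SigV4 pre-signed GET URL for an S3 object (illustrative sketch).

    `now`, if given, must be timezone-aware; it is converted to UTC.
    """
    utc = datetime.timezone.utc
    now = (now or datetime.datetime.now(utc)).astimezone(utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"  # assumed virtual-hosted style
    scope = f"{date_stamp}/{region}/s3/aws4_request"
    canonical_uri = "/" + quote(key, safe="/")

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Query parameters are percent-encoded and sorted by name.
    canonical_query = "&".join(
        f"{quote(name, safe='')}={quote(value, safe='')}"
        for name, value in sorted(params.items())
    )
    # Only the host header is signed; the payload is left unsigned.
    canonical_request = "\n".join(
        ["GET", canonical_uri, canonical_query,
         f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join(
        ["AWS4-HMAC-SHA256", amz_date, scope,
         hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()]
    )
    # Derive the signing key: date, then region, service, and terminator.
    signing_key = _sign(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    for part in (region, "s3", "aws4_request"):
        signing_key = _sign(signing_key, part)
    signature = hmac.new(signing_key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"
```

A call such as `presign_get_url("drupal-test-s3", "2023-01/test1_0.mp4", "eu-west-1", "EXAMPLEKEYID", "example-secret")` (placeholder credentials) would return a URL valid for five minutes. A controller could answer the private file request with an HTTP redirect to it after running the usual access checks.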

  • Thanks for testing and all the insights.
    I have decided to move on to what seems like a better solution for serving S3 files.
    I have implemented an S3 File Gateway which allows mounting S3 buckets as local volumes on EC2.
    This has basically solved all issues since it allows treating S3 buckets as a local FS.
    It does require a separate EC2 server to provide the gateway/cache/shares and may not be a fit for small-scale setups. However, since we wish to use S3 storage extensively, this seems like the way to go.

  • Status changed to Closed: won't fix over 1 year ago
  • Closing this, as there are no additional comments.
