Downloading submission csv and .zip file archive incomplete.

Created on 23 June 2023, over 1 year ago
Updated 24 June 2024, 6 months ago
🐛 Bug report
Status: Needs review
Version: 6.2
Component: Code
Created by: 🇨🇦 Canada drupalthings


Comments & Activities


  • 🇺🇸 United States jrockowitz Brooklyn, NY

    Are you using a load balancer?

  • Status changed to Postponed: needs info over 1 year ago
  • Status changed to Active over 1 year ago
  • 🇨🇦 Canada drupalthings

    No, only a single Windows 10 dev machine running an Ubuntu 20.04 VM. Drupal and MySQL are installed on the Ubuntu VM, with the public and private file systems located on the Ubuntu file system. No load balancing.

  • How can we set up the precise situation to reproduce this?

  • 🇨🇦 Canada drupalthings

    Good point. If you accept the premise that the archive needs to be closed before it is opened again, you could add some print statements that write to the log file each time the archive is created or opened and each time a file is added. You would need a webform with at least a single file upload field that accepts image files (probably any large file would do). You could get away with a smaller batch size if you changed the minimum size of the batch to less than 100 (I think it might be hardcoded as such). It's interesting to note this comment in the documentation for ZipArchive at https://www.php.net/manual/en/ziparchive.close.php#93322:

    Pay attention, that ZipArchive::addFile() only opens file descriptors and does not compress it. And only ZipArchive::close() compress file and it takes quite a lot of time. Be careful with timeouts.

    This little program creates 20 temporary 2MB files and adds them to an archive in two batches of 10 files each. On my system at least, only the last 10 end up in the archive if I comment out the close. If I close the archive after each batch, all 20 get added:

    <?php
    class BatchArchive
    {
        private $zip;
        private $archiveName;
    
        public function __construct($archiveName)
        {
            $this->archiveName = $archiveName;
            $this->zip = new ZipArchive();
    
            $flags = file_exists($this->archiveName) ? 0 : (ZipArchive::CREATE | ZipArchive::OVERWRITE);
            echo "Opening archive {$this->archiveName} with flags: {$flags}\n";
            if ($this->zip->open($this->archiveName, $flags) !== true) {
                die("Failed to create or open archive: $this->archiveName");
            }
        }
    
        public function addBatchToArchive($files)
        {
            foreach ($files as $file) {
                $localName = basename($file);
                $this->zip->addFile($file, $localName);
                echo "Added file to archive: $localName\n";
            }
        }
    
        public function closeArchive()
        {
            //$this->zip->close();
            //echo "Archive closed: $this->archiveName\n";
        }
    }
    
    $numberOfFiles = 20;
    $filesPerBatch = $numberOfFiles / 2;
    
    // Create temporary files
    $tempFiles = [];
    for ($i = 1; $i <= $numberOfFiles; $i++) {
        $tempFileName = tempnam(sys_get_temp_dir(), 'tempfile' . $i);
        $fileSize = 2 * 1024 * 1024; // 2MB
        $randomData = openssl_random_pseudo_bytes($fileSize);
        file_put_contents($tempFileName, $randomData);
        $tempFiles[] = $tempFileName;
        echo "Created temporary file: $tempFileName\n";
    }
    
    // Divide files into batches
    $batches = array_chunk($tempFiles, $filesPerBatch);
    
    // Create batch objects and add batches to archive
    $archiveName = 'archive.zip';
    if (file_exists($archiveName)) {
        unlink($archiveName);
        echo "Deleted existing archive: $archiveName\n";
    }
    
    foreach ($batches as $batch) {
        echo "Adding batch to archive\n";
        $batchArchive = new BatchArchive($archiveName);
        $batchArchive->addBatchToArchive($batch);
        $batchArchive->closeArchive();
    }
    ?>

    And the output with the close archive commented out is:

    dpostle-tech@drupal-develop:~/projects/temp$ php test9.php
    Created temporary file: /tmp/tempfile1jeztvO
    Created temporary file: /tmp/tempfile2OfQlwM
    Created temporary file: /tmp/tempfile3raKBLO
    Created temporary file: /tmp/tempfile42DgOhM
    Created temporary file: /tmp/tempfile5phKybP
    Created temporary file: /tmp/tempfile6qlSfsP
    Created temporary file: /tmp/tempfile7mzDsHM
    Created temporary file: /tmp/tempfile8jTrHKN
    Created temporary file: /tmp/tempfile9P5qLGM
    Created temporary file: /tmp/tempfile10H0hH2O
    Created temporary file: /tmp/tempfile11VKTCdP
    Created temporary file: /tmp/tempfile12CnPzUL
    Created temporary file: /tmp/tempfile13Uz0dxN
    Created temporary file: /tmp/tempfile14BEo6SM
    Created temporary file: /tmp/tempfile15wyjHFM
    Created temporary file: /tmp/tempfile16poB9cP
    Created temporary file: /tmp/tempfile17epq1yM
    Created temporary file: /tmp/tempfile180irQ0O
    Created temporary file: /tmp/tempfile19yr5BnO
    Created temporary file: /tmp/tempfile20SK5v3M
    Deleted existing archive: archive.zip
    Adding batch to archive
    Opening archive archive.zip with flags: 9
    Added file to archive: tempfile1jeztvO
    Added file to archive: tempfile2OfQlwM
    Added file to archive: tempfile3raKBLO
    Added file to archive: tempfile42DgOhM
    Added file to archive: tempfile5phKybP
    Added file to archive: tempfile6qlSfsP
    Added file to archive: tempfile7mzDsHM
    Added file to archive: tempfile8jTrHKN
    Added file to archive: tempfile9P5qLGM
    Added file to archive: tempfile10H0hH2O
    Adding batch to archive
    Opening archive archive.zip with flags: 9
    Added file to archive: tempfile11VKTCdP
    Added file to archive: tempfile12CnPzUL
    Added file to archive: tempfile13Uz0dxN
    Added file to archive: tempfile14BEo6SM
    Added file to archive: tempfile15wyjHFM
    Added file to archive: tempfile16poB9cP
    Added file to archive: tempfile17epq1yM
    Added file to archive: tempfile180irQ0O
    Added file to archive: tempfile19yr5BnO
    Added file to archive: tempfile20SK5v3M
    
    dpostle-tech@drupal-develop:~/projects/temp$ unzip -l archive.zip
    Archive:  archive.zip
      Length      Date    Time    Name
    ---------  ---------- -----   ----
      2097152  2023-06-27 13:58   tempfile11VKTCdP
      2097152  2023-06-27 13:58   tempfile12CnPzUL
      2097152  2023-06-27 13:58   tempfile13Uz0dxN
      2097152  2023-06-27 13:58   tempfile14BEo6SM
      2097152  2023-06-27 13:58   tempfile15wyjHFM
      2097152  2023-06-27 13:58   tempfile16poB9cP
      2097152  2023-06-27 13:58   tempfile17epq1yM
      2097152  2023-06-27 13:58   tempfile180irQ0O
      2097152  2023-06-27 13:58   tempfile19yr5BnO
      2097152  2023-06-27 13:58   tempfile20SK5v3M
    ---------                     -------
     20971520                     10 files
    dpostle-tech@drupal-develop:~/projects/temp$

    The key point here is that even though the first batch was added to the archive, the archive file did not yet exist on disk because it had never been closed, so it was created again when the second batch was added.
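
A minimal sketch of the close-after-each-batch variant described above, using only `ZipArchive::open()`, `addFile()`, and `close()`. The function name `addBatch`, the file names, and the small file sizes are illustrative assumptions, not taken from the Webform module.

```php
<?php
// Sketch only: addBatch(), the paths, and the sizes are illustrative.

/**
 * Adds one batch of files and closes the archive, so the data is written
 * to disk before the next batch opens the archive again.
 */
function addBatch(string $archivePath, array $files): void
{
    $zip = new ZipArchive();
    // ZipArchive::CREATE opens an existing archive or creates a new one.
    if ($zip->open($archivePath, ZipArchive::CREATE) !== true) {
        throw new RuntimeException("Cannot open archive: $archivePath");
    }
    foreach ($files as $file) {
        // addFile() only queues the file; nothing is written yet.
        $zip->addFile($file, basename($file));
    }
    // close() is the step that writes the queued files to the archive.
    if (!$zip->close()) {
        throw new RuntimeException("Cannot write archive: $archivePath");
    }
}

// Create 4 small temporary files and split them into 2 batches.
$tempFiles = [];
for ($i = 1; $i <= 4; $i++) {
    $path = tempnam(sys_get_temp_dir(), 'batch');
    file_put_contents($path, random_bytes(1024));
    $tempFiles[] = $path;
}

$archivePath = sys_get_temp_dir() . '/batch_demo_' . uniqid() . '.zip';
foreach (array_chunk($tempFiles, 2) as $batch) {
    addBatch($archivePath, $batch);
}

// Reopen the finished archive and count its entries.
$check = new ZipArchive();
if ($check->open($archivePath) !== true) {
    throw new RuntimeException("Cannot reopen archive: $archivePath");
}
$entryCount = $check->numFiles;
$check->close();
echo "Entries in archive: $entryCount\n";

// Clean up the temporary files and the demo archive.
array_map('unlink', $tempFiles);
unlink($archivePath);
```

Because each batch ends with `close()`, the second `open()` finds the archive on disk and appends to it instead of recreating it.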

  • 🇨🇦 Canada drupalthings

    As for a precise situation for reproducing, I was able to reproduce using a simple test form with one field for uploading files.

    1. I used the test functionality of the form to create 12 test submissions,
    2. I then created a temporary file of 2MB and copied it to the private/default/webforms/test_downloads submission directories for each uploaded file,
    3. I changed "Batch Export Size" to 10 under Configuration->Advanced webform settings (the min size is not limited to 100, I misread that),
    4. I made sure the tmp directory was clear of any partial archives,
    5. I used drush to clear the cache and downloaded the results with "Download uploaded files" checked.

    There appears to be a timing aspect related to the size of the uploaded files -- I couldn't get it to happen with the tiny uploaded files created by the test form tab without replacing the uploaded files with the larger 2MB versions.

  • 🇨🇦 Canada hargurpreet Kitchener

    I have also experienced the same issue while exporting both the submissions and the uploaded files. To fix it, I have created this patch, which works fine for me. Thanks!

  • 🇺🇸 United States jrockowitz Brooklyn, NY
  • 🇺🇸 United States fizcs3 Omaha, Nebraska; USA

    We also had a very odd issue downloading a zip archive, where it would skip submissions. It didn't happen for all webforms, and I really can't characterize it any better than that it involved choosing:
    * Export Format: PDF documents
    * Download uploaded files: checked/yes

    All I can say is that I applied the small patch in #8 and it fixed it.
    I am confirming the patch applies on Drupal 10.2.6 with Webform 6.2.2.
    Thank you @hargurpreet

  • Status changed to Needs review 6 months ago
  • 🇺🇸 United States jrockowitz Brooklyn, NY

    I am unsure if the patch is getting to the root cause. It seems that the archive is being closed and needs to be reopened when zipping large files.

    @see https://stackoverflow.com/questions/16121885/php-zip-archive-memory-ram-...

    I would be more comfortable with a patch that checks if the archive is closed and then reopens it. The current patch is reopening the archive with every file.
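
One way the "reopen only if closed" idea might look, as a rough sketch. The class `ReopeningArchive`, its `flush()` method, and the `openCount` counter are hypothetical names invented for this illustration; this is not the Webform exporter's code and not the patch from #8. It tracks the open state in its own flag rather than querying `ZipArchive`.

```php
<?php
// Hypothetical helper, not Webform's actual code: it remembers whether the
// archive is open and reopens it only when it has been closed.
final class ReopeningArchive
{
    private ?ZipArchive $zip = null;
    private string $path;
    private bool $isOpen = false;
    /** Number of times the archive was opened (for illustration only). */
    public int $openCount = 0;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    /** Opens the archive only if it is not already open. */
    private function ensureOpen(): void
    {
        if ($this->isOpen) {
            return;
        }
        $this->zip = new ZipArchive();
        if ($this->zip->open($this->path, ZipArchive::CREATE) !== true) {
            throw new RuntimeException("Cannot open archive: {$this->path}");
        }
        $this->isOpen = true;
        $this->openCount++;
    }

    public function addFile(string $file): void
    {
        $this->ensureOpen();
        $this->zip->addFile($file, basename($file));
    }

    /** Writes pending files to disk; the next addFile() reopens. */
    public function flush(): void
    {
        if ($this->isOpen) {
            $this->zip->close();
            $this->isOpen = false;
        }
    }
}

// Usage: three files, with a flush between the second and third.
$files = [];
for ($i = 0; $i < 3; $i++) {
    $files[$i] = tempnam(sys_get_temp_dir(), 'reopen');
    file_put_contents($files[$i], random_bytes(512));
}
$archivePath = sys_get_temp_dir() . '/reopen_demo_' . uniqid() . '.zip';

$archive = new ReopeningArchive($archivePath);
$archive->addFile($files[0]); // opens the archive
$archive->addFile($files[1]); // already open, no reopen
$archive->flush();            // end of first batch
$archive->addFile($files[2]); // closed, so it is reopened once
$archive->flush();

// Count the entries that were written.
$check = new ZipArchive();
$check->open($archivePath);
$entryCount = $check->numFiles;
$check->close();
echo "Opens: {$archive->openCount}, entries: $entryCount\n";

array_map('unlink', $files);
unlink($archivePath);
```

The archive is opened once per batch rather than once per file, which is the distinction raised in the comment above.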
