Rewrite export.inc to avoid memory problems and timeout on export

Created on 14 August 2018, almost 6 years ago
Updated 7 May 2024, about 2 months ago

Although we have cleaned up Core so far, the German translation still cannot be exported if you use the All releases merged option when Core is selected. The export still leads to a 500 error. It would be great if this could be fixed.

Reproduce:

πŸ› Bug report
Status

Postponed: needs info

Version

3.0

Component

Code

Created by

🇩🇪 Germany Joachim Namyslo Bayreuth 🇩🇪 🇪🇺

Comments & Activities

  • I was able to reproduce it on the current 3.0.x-dev and I'm working on a fix.

  • Unfortunately, exporting a project that consists of a large code base, like Core, has major issues. The generation takes quite some time and the produced
    translation files are very large when the Include metadata option (Verbose output on 7.x-1.x) is enabled. They have a header that includes a Generated from files listing. These files are prefixed with the version string when releases are merged.

    On my local setup I've run several exports of the German translation of Core in which almost all releases listed on Core's update endpoint were parsed. That's ~630 releases, starting from 4.5.0.

    The results are:
    - All flags enabled (Download untranslated and translated strings + Inject German suggestions? + Include metadata):
    The generation took 36min and produced a 337MB file.
    The file list mentioned above is 724084 lines long, and most of the lines look like this: # drupal-10.3.x-dev/core/themes/starterkit_theme/starterkit_theme.info.yml: n/a. I've seen only a few that look different, like this one: install.inc,v 1.24 2006/10/23 06:45:17 dries

    Every msgid / msgstr item has a reference to a source file and line number. When releases are merged, there is a reference for every release. That makes the reference lines really long (multiple thousand characters). E.g. #: core/authorize.php:146; core/lib/Drupal/Core/Updater/Module.php:130; core/lib/Drupal/Core/Updater/Theme.php:110; ...

    Just in case someone is interested, I've attached the compressed file.

    - When the Include metadata option was unchecked (and Download untranslated and translated strings + Inject German suggestions? were checked), the generation took 24.69min and produced a 3.9MB file.

    - The fastest export took 11.82min and produced a 311.8MB file (with Download only translated strings + Include metadata checked and Inject German suggestions? unchecked). With Inject German suggestions? checked as well, it took 13.8min and produced a 315.5MB file.
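
The per-release references described above can be illustrated with a small sketch. It is written in Python rather than the module's PHP, all function names are hypothetical, and the input data is made up. The de-duplicating variant is only one conceivable mitigation; it is not something the module does or that this issue proposes.

```python
def merged_reference_comment(occurrences):
    """Build a '#:' reference comment from (release, path, line) tuples.

    Hypothetical helper: emits one reference per release, which is how the
    comment grows when releases are merged.
    """
    refs = [f"{path}:{line}" for _release, path, line in occurrences]
    return "#: " + "; ".join(refs)


def deduplicated_reference_comment(occurrences):
    """Same input, but each distinct path:line pair is kept only once."""
    refs = dict.fromkeys(f"{path}:{line}" for _release, path, line in occurrences)
    return "#: " + "; ".join(refs)


# Made-up data: one string found at the same location in 630 releases.
occurrences = [(f"release-{i}", "core/authorize.php", 146) for i in range(630)]

full = merged_reference_comment(occurrences)
short = deduplicated_reference_comment(occurrences)

# The merged comment is thousands of characters long, the other one is not.
print(len(full), len(short))
```

In real exports the line numbers usually differ between releases, so de-duplication would shrink the comment less than in this artificial case.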

    ---

    That's on modern hardware (Zen 3 CPU + NVMe), running on Linux.

    I've never used a local translation application, but I can hardly imagine that this amount of data is actually useful. I wonder what the value of having these file listings is. Can such large files even be processed by local translation applications? Are they able to handle / visualize that many references? Do they remove the metadata? If not, the files are too large for an import (the file size limit is 50MB on production).
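
As a rough check of the reported figures, the following Python sketch compares the two runs that differ only in the Include metadata option and relates the larger file to the 50MB import limit. The variable names are made up for illustration, and the sizes are taken at face value from the comment.

```python
# Figures reported in the comment (in MB).
WITH_METADATA_MB = 337.0     # all flags enabled
WITHOUT_METADATA_MB = 3.9    # same flags, "Include metadata" unchecked
IMPORT_LIMIT_MB = 50.0       # import file size limit on production

# Share of the large file that is attributable to the metadata.
metadata_mb = WITH_METADATA_MB - WITHOUT_METADATA_MB
metadata_share = metadata_mb / WITH_METADATA_MB

# How far the large file exceeds the import limit.
times_over_limit = WITH_METADATA_MB / IMPORT_LIMIT_MB

print(f"metadata share: {metadata_share:.1%}")       # 98.8%
print(f"times over limit: {times_over_limit:.1f}x")  # 6.7x
```

Under these numbers almost the entire file consists of metadata, and only the export without metadata stays below the import limit.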

    The batch rewrite is (probably) still required, because the timeout also happens when a single release is exported, but I think we should limit the All releases merged option to projects that are below a specific threshold (e.g. a line count and / or release count).
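
The proposed threshold could look roughly like the following sketch. It is Python rather than the module's PHP, and every name and limit in it is an assumption: the issue only suggests "a line count and / or release count" and does not name concrete values.

```python
from dataclasses import dataclass

# Hypothetical limits, chosen only for illustration.
MAX_RELEASES = 50
MAX_FILE_LIST_LINES = 100_000


@dataclass
class ProjectStats:
    release_count: int      # number of parsed releases
    file_list_lines: int    # length of the "Generated from files" listing


def allow_merged_export(stats: ProjectStats,
                        max_releases: int = MAX_RELEASES,
                        max_lines: int = MAX_FILE_LIST_LINES) -> bool:
    """Return True if the "All releases merged" option should be offered."""
    return (stats.release_count <= max_releases
            and stats.file_list_lines <= max_lines)


# Core, using the numbers reported in this comment.
core = ProjectStats(release_count=630, file_list_lines=724_084)
# A made-up small contributed project.
small = ProjectStats(release_count=12, file_list_lines=4_000)

print(allow_merged_export(core))   # False
print(allow_merged_export(small))  # True
```

Requiring both limits to hold is the stricter reading of "and / or"; a real implementation would have to decide which combination is intended.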

  • Status changed to Postponed: needs info about 2 months ago