Unfortunately, exporting a project with a large code base, such as core, has major issues. The generation takes quite some time, and the produced
translation files are very large when the Include metadata (Verbose output on 7.x-1.x) option is enabled. They have a header that includes a Generated from files
listing. These files are prefixed with the version string when releases are merged.
On my local machine I've run several exports of the German translation of core in which almost all releases listed on core's update endpoint were parsed. That's ~630 releases, starting from 4.5.0.
The results are:
- All flags enabled (Download untranslated and translated strings + Inject German suggestions? + Include metadata):
The generation took 36min and produced a 337MB file.
The above-mentioned file list is 724084 lines long, and most of its lines look like this: # drupal-10.3.x-dev/core/themes/starterkit_theme/starterkit_theme.info.yml: n/a
I've seen only a few that look different, like this one: install.inc,v 1.24 2006/10/23 06:45:17 dries
Every msgid/msgstr item has a reference to a source file and line number. When releases are merged, there is a reference for every release. That makes the reference lines really long (multiple thousand characters), e.g. #: core/authorize.php:146; core/lib/Drupal/Core/Updater/Module.php:130; core/lib/Drupal/Core/Updater/Theme.php:110; ...
Just in case someone is interested, I've attached the compressed file.
- When the Include metadata option was unchecked (and Download untranslated and translated strings + Inject German suggestions? were checked), the generation took 24.69min and produced a 3.9MB file.
- The fastest export took 11.82min and produced a 311.8MB file (with Download only translated strings + Include metadata checked and Inject German suggestions? unchecked). With Inject German suggestions? checked as well, it took 13.8min and produced a 315.5MB file.
---
That's on modern hardware (Zen 3 CPU + NVMe), running on Linux.
I've never used a local translation application, but I can hardly imagine that this amount of data is actually useful. I wonder what the value of having these file listings is. Can those large files even be processed by local translation applications? Are they able to handle / visualize that many references? Do they remove the metadata? If not, the files are too large for an import (the file size limit is 50MB on production).
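To get a feel for how much of an export is metadata, here is a minimal Python sketch that drops the reference comments and the header file listing from a .po file and reports the share of bytes removed. It is not part of the module; the function names are hypothetical, and it assumes that reference lines start with "#:" (the standard gettext reference comment) and that the file listing consists of "# " lines placed before the first msgid.

```python
from typing import Iterable, Iterator


def strip_po_metadata(lines: Iterable[str]) -> Iterator[str]:
    """Yield PO lines without reference comments and header file listing."""
    in_header = True  # stays True until the first msgid line is seen
    for line in lines:
        if line.startswith("msgid"):
            in_header = False
        if line.startswith("#:"):
            continue  # gettext reference comment (source file and line)
        if in_header and line.startswith("# "):
            continue  # assumption: an entry of the "Generated from files" listing
        yield line


def metadata_share(lines: list[str]) -> float:
    """Return the fraction of bytes that strip_po_metadata() would remove."""
    total = sum(len(line.encode("utf-8")) for line in lines)
    kept = sum(len(line.encode("utf-8")) for line in strip_po_metadata(lines))
    return 0.0 if total == 0 else 1 - kept / total


sample = [
    "# drupal-10.3.x-dev/core/themes/starterkit_theme/starterkit_theme.info.yml: n/a\n",
    'msgid ""\n',
    'msgstr ""\n',
    "\n",
    "#: core/authorize.php:146; core/lib/Drupal/Core/Updater/Module.php:130\n",
    'msgid "Home"\n',
    'msgstr "Startseite"\n',
]
stripped = list(strip_po_metadata(sample))
share = metadata_share(sample)
```

For a real export the lines would be read from the file (e.g. with open(path, encoding="utf-8")), and the stripped result could be written to a new file and checked against the 50MB import limit.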
The batch rewrite is still (probably) required, because the timeout also happens when a single release is exported, but I think we should limit the All releases merged option to projects that are below a specific threshold (e.g. a line count and / or release count).
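The threshold idea could look roughly like the following Python sketch. It only illustrates the check itself; the class, the function, and both limit values are hypothetical and would need to be tuned against real projects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportLimits:
    """Hypothetical limits for offering the 'All releases merged' option."""
    max_releases: int = 100      # assumed value, not taken from the module
    max_lines: int = 500_000     # assumed value, not taken from the module


def allow_merged_export(release_count: int, line_count: int,
                        limits: ExportLimits = ExportLimits()) -> bool:
    """Return True only if the project stays below both thresholds."""
    return (release_count <= limits.max_releases
            and line_count <= limits.max_lines)


# Core-sized project from the measurements above: ~630 releases, 724084 listing lines.
core_allowed = allow_merged_export(630, 724084)
# A small, made-up contrib project.
small_allowed = allow_merged_export(12, 4000)
```

Requiring both limits to hold is the stricter reading of "and / or"; checking only one of them would be a one-line change.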
- Status changed to Postponed: needs info
7 months ago 3:44pm 7 May 2024 - Issue was unassigned.