Problem/Motivation
When uploading files with accented or special characters in their filenames (e.g., Mönchengladbach.png), Drupal produces unexpected transliterations. Example:
- Original filename: Mönchengladbach.png
- After upload (with transliteration enabled): Monchengladbach.png
On a site whose language is German, the expected transliteration is Moenchengladbach.png, because the German transliteration rules map "ö" to "oe".
The problem persists even with transliteration enabled, because the filename PHP receives from the client may already be in Unicode NFD (decomposed) form. In decomposed form, "ö" is stored as "o" (U+006F) followed by a combining diaeresis (U+0308) instead of the single code point U+00F6. The mapping for "ö" therefore never matches: the base "o" passes through unchanged and the combining mark is lost, which yields Monchengladbach.png.
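The mismatch can be shown outside Drupal with a minimal sketch. naive_transliterate() is a hypothetical stand-in for a language-specific mapping, not Drupal's transliteration service, and the way it strips leftover combining marks is an assumption chosen to mirror the observed result:

```php
<?php
/**
 * Hypothetical stand-in for a transliteration table that only knows
 * the precomposed "ö" (U+00F6). Not Drupal's actual implementation.
 */
function naive_transliterate(string $name): string {
  $mapped = strtr($name, ["\u{00F6}" => 'oe']);
  // Assumption: leftover combining marks (category Mn) are simply removed.
  return preg_replace('/\p{Mn}/u', '', $mapped) ?? $mapped;
}

$nfc = "M\u{00F6}nchengladbach.png";   // composed: one code point for "ö"
$nfd = "Mo\u{0308}nchengladbach.png";  // decomposed: "o" + U+0308

var_dump($nfc === $nfd);                  // bool(false)
echo naive_transliterate($nfc), PHP_EOL;  // Moenchengladbach.png
echo naive_transliterate($nfd), PHP_EOL;  // Monchengladbach.png
```

Both inputs look identical on screen, but only the composed one hits the "ö" entry in the table.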
Steps to reproduce
- Ensure "Transliterate file names on upload" is enabled at /admin/config/media/file-system.
- Upload a file named Mönchengladbach.png from a macOS machine using Finder or Safari (macOS supplies the filename in NFD form by default).
- Inspect the resulting filename in Drupal: it is Monchengladbach.png instead of the expected Moenchengladbach.png.
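When reproducing, it can help to confirm that the uploaded name really arrived decomposed. The following is a rough diagnostic sketch; has_combining_marks() and the sample names are hypothetical and not part of Drupal. A combining mark is only a hint, since some scripts keep such marks even in NFC.

```php
<?php
/**
 * Hypothetical diagnostic: TRUE if $name contains a non-spacing mark,
 * which suggests the name is in decomposed (NFD) form.
 */
function has_combining_marks(string $name): bool {
  // \p{Mn} matches non-spacing marks such as U+0308 COMBINING DIAERESIS.
  return preg_match('/\p{Mn}/u', $name) === 1;
}

$decomposed = "Mo\u{0308}nchengladbach.png";
$composed = "M\u{00F6}nchengladbach.png";

var_dump(has_combining_marks($decomposed)); // bool(true)
var_dump(has_combining_marks($composed));   // bool(false)

// The raw bytes make the difference visible.
echo bin2hex("o\u{0308}"), PHP_EOL; // 6fcc88
echo bin2hex("\u{00F6}"), PHP_EOL;  // c3b6
```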
Proposed resolution
Enforce Unicode NFC normalization on uploaded filenames before transliteration is applied, for example:
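if (class_exists('Normalizer') && !\Normalizer::isNormalized($string, \Normalizer::FORM_C)) {
  // normalize() returns FALSE on failure (e.g. malformed UTF-8); keep the original then.
  $normalized = \Normalizer::normalize($string, \Normalizer::FORM_C);
  $string = $normalized !== FALSE ? $normalized : $string;
}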
This ensures combining accents are properly merged with base characters, enabling transliteration mappings to work as intended.
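As a self-contained illustration, the same idea can be wrapped in a small helper. normalize_filename_nfc() is a hypothetical name, and where such a call would live in Drupal's upload handling is left open here:

```php
<?php
/**
 * Hypothetical helper: returns $filename in NFC when the intl extension
 * is available, and the unchanged input otherwise.
 */
function normalize_filename_nfc(string $filename): string {
  if (!class_exists('Normalizer')) {
    // Without intl there is nothing to normalize with; pass through.
    return $filename;
  }
  if (\Normalizer::isNormalized($filename, \Normalizer::FORM_C)) {
    return $filename;
  }
  $normalized = \Normalizer::normalize($filename, \Normalizer::FORM_C);
  // normalize() returns false on invalid input such as malformed UTF-8.
  return $normalized === false ? $filename : $normalized;
}

$nfd = "Mo\u{0308}nchengladbach.png";
$clean = normalize_filename_nfc($nfd);
// With intl loaded, $clean equals "M\u{00F6}nchengladbach.png".
var_dump($clean === "M\u{00F6}nchengladbach.png");
```

Falling back to the original string keeps the upload working on hosts without intl, at the cost of leaving the transliteration problem unfixed there.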
Additional Notes
- Tested with Drupal 11, PHP 8.3+
- Verified using macOS Finder, Firefox, and Chrome uploads.
- The PHP intl extension, which provides the Normalizer class, is required for reliable Unicode normalization.