JSON API endpoints return HTTP 500 errors with "Malformed UTF-8 characters, possibly incorrectly encoded" when serving content that contains French accented characters (à, é, è, ç, etc.). The error occurs specifically in the Symfony JsonEncoder during response serialization.
Root Cause: PHP's PCRE functions (preg_replace, preg_match, etc.) are not UTF-8 aware by default. When these functions process strings containing multibyte UTF-8 characters without the u (unicode) modifier, they treat each byte separately instead of as complete UTF-8 characters, corrupting the encoding.
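The byte-versus-character difference can be seen in a minimal sketch. The variable names are illustrative only and are not taken from the affected module:

```php
<?php
// 'à' is a single character, but two bytes in UTF-8: \xC3\xA0.
$value = "\xC3\xA0";

// Without the u modifier, '.' matches one byte at a time.
$byteMatches = preg_match_all('/./', $value, $bytes);

// With the u modifier, '.' matches one complete UTF-8 character.
$charMatches = preg_match_all('/./u', $value, $chars);

var_dump($byteMatches);            // int(2)
var_dump($charMatches);            // int(1)
var_dump(bin2hex($bytes[0][0]));   // string(2) "c3"
```

Any replacement that acts on only one of those two bytes leaves the other behind as an invalid sequence.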
Impact:
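The UTF-8 sequence for 'à' (\xC3\xA0) gets corrupted when processed by preg_replace('/\s+/', ' ', $value) without the unicode modifier: in byte mode, and depending on the platform and locale, PCRE can treat the lone byte \xA0 (the non-breaking space in Latin-1) as whitespace and replace it with a space. The orphaned lead byte \xC3 that remains is not valid UTF-8, so json_encode() fails during JSON API response generation.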
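The failure and a possible fix can be reproduced in isolation with the sketch below. Whether \s matches the byte \xA0 without the u modifier appears to depend on the platform and locale, so the sketch matches that byte explicitly to simulate the corruption; this is an assumption made for reproducibility, not the module's actual code. The sample string and variable names are likewise hypothetical.

```php
<?php
// "voilà  tout" in UTF-8, with two spaces to collapse.
$value = "voil\xC3\xA0  tout";

// Simulated corruption: a byte-mode pattern replaces only the
// second byte of 'à', leaving the lead byte \xC3 orphaned.
$corrupted = preg_replace('/\xA0/', ' ', $value);

$encoded = json_encode($corrupted);
var_dump($encoded);                 // bool(false)
$errorCode = json_last_error();     // JSON_ERROR_UTF8
echo json_last_error_msg(), PHP_EOL;

// Candidate fix: the u modifier makes PCRE treat the subject
// as UTF-8, so 'à' is handled as one character and kept intact.
$fixed = preg_replace('/\s+/u', ' ', $value);

echo json_encode($fixed, JSON_UNESCAPED_UNICODE), PHP_EOL;
```

Note that with the u modifier, preg_replace() returns null when the subject is already invalid UTF-8, so callers may want to check for null before passing the result on.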
A patch follows.
Active
2.1
Code