Text parts with markup lost with Google API

Created on 26 May 2025, 5 days ago

Problem/Motivation

Using the Google API translation backend, some content gets lost.
I cannot really pin down when this happens.

However, there seems to be a reliable case which can be used for reproduction.

Steps to reproduce

  1. Enter the following into the HTML source of a body field of a basic page (that's configured to be translatable):
    <p>
        hier kommt content her, ggf auch mit markup:
    </p>
    <ol>
        <li>
            eins
        </li>
        <li>
            zwei
        </li>
        <li>
            drei
        </li>
    </ol>
    <p>
        &nbsp;
    </p>
    <p>
        weiters hier noch <strong>fett</strong>, und wird das darüber in der Liste nicht übersetzt?
    </p>
  2. Save.
  3. Translate to English.

The following is the result (HTML tags are not escaped, i.e. they are correctly interpreted; this is the source code visible in CKEditor's source editing mode):

<p>
    Here comes content, possibly also with Markup:
</p>
<ul>
    <li>
        eins
    </li>
    <li>
        &nbsp;
    </li>
    <li>
        &nbsp;
    </li>
</ul>
<p>
    &nbsp;
</p>
<p>
    Furthermore <strong>fat </strong>, and is that not translated in the list?
</p>

Note how "eins" (one) is not translated, and "zwei" (two) and "drei" (three) are gone entirely, while markup and translation are otherwise correct.
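To spot this kind of loss without comparing the markup by eye, a small check can compare the list items of the source and the translated HTML. This is only a minimal sketch using Python's standard `html.parser`; the names `ListItemCollector`, `list_items` and `lost_items` are made up for illustration and are not part of the module or of the Google API.

```python
from html.parser import HTMLParser


class ListItemCollector(HTMLParser):
    """Collects the visible text of every <li> element, in document order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.items = []
        self._in_li = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self._in_li = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag == "li" and self._in_li:
            # &nbsp; arrives as "\xa0"; treat it like ordinary whitespace.
            text = "".join(self._buf).replace("\xa0", " ").strip()
            self.items.append(text)
            self._in_li = False

    def handle_data(self, data):
        if self._in_li:
            self._buf.append(data)


def list_items(html):
    """Return the stripped text of each <li> in the given HTML string."""
    parser = ListItemCollector()
    parser.feed(html)
    parser.close()
    return parser.items


def lost_items(source_html, translated_html):
    """Indexes of list items that have text in the source but none in the result."""
    src = list_items(source_html)
    dst = list_items(translated_html)
    return [
        i for i, text in enumerate(src)
        if text and (i >= len(dst) or not dst[i])
    ]
```

Fed with the source and result from this report, `lost_items` would flag the second and third list item (indexes 1 and 2). It does not detect items that came back untranslated, such as "eins".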

I drilled down as far as I could into the result sent back by the API, and it seems this really is the content it returns, but I may be wrong.
Maybe there needs to be some kind of encoding before and decoding afterwards to make it work correctly.
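One possible shape of such an encode/decode step, purely as a sketch: keep all tags, entities and whitespace on our side and hand only the text nodes to the translator, then stitch the result back together. Everything here is an assumption for illustration. `translate` is a hypothetical callable standing in for the backend call, `TextNodeTranslator` and `translate_html` are made-up names, and this is not how the module currently works.

```python
from html import escape
from html.parser import HTMLParser


class TextNodeTranslator(HTMLParser):
    """Re-emits the HTML unchanged, passing only text nodes to `translate`."""

    def __init__(self, translate):
        # Keep entities such as &nbsp; as-is instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.translate = translate  # hypothetical: str -> str
        self.out = []

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_data(self, data):
        stripped = data.strip()
        if not stripped:
            # Pure whitespace (indentation, newlines) is never sent out.
            self.out.append(data)
            return
        lead = data[: len(data) - len(data.lstrip())]
        trail = data[len(data.rstrip()):]
        translated = escape(self.translate(stripped), quote=False)
        self.out.append(lead + translated + trail)


def translate_html(html, translate):
    """Translate only the text nodes of `html`, leaving the markup untouched."""
    parser = TextNodeTranslator(translate)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

The obvious downside is that a sentence containing inline markup such as `<strong>` is split into several fragments that are translated without their context, so translation quality may suffer. It would at least guarantee that no element comes back empty.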

🐛 Bug report
Status

Active

Version

1.4

Component

Code

Created by

🇦🇹Austria tgoeg
