REST views: double encoding of apostrophes in REST Export display

Created on 6 December 2017, over 7 years ago
Updated 31 January 2023, over 2 years ago

In the sample REST export view output below (JSON format), the apostrophe character (ASCII code 39) is double encoded as "\u0026#039;".

[{"book_background_pattern":"\/sites\/default\/files\/a_visit_to\/background_images\/avt_s18_background_pattern.jpg","cover":"\/sites\/default\/files\/a_visit_to\/background_images\/avt_doc_S18_cover.jpg","dark_color":"0073b9","accent_color":"a92825","light_color":"c7d5ee","header_background":"\/sites\/default\/files\/a_visit_to\/background_images\/avt_s18_header.png","title":"The Doctor\u0026#039;s Office: A 4D Book","vuforia_device_database":"\/sites\/default\/files\/a_visit_to\/doctors_office\/targets\/a_visit_to_doctors_office.zip","id":"8799","author":"Blake A. Hoena","illustrator":"","series":"A Visit to...","series_id":"268"}]
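
To see the two layers of encoding from a consumer's side, the value can be decoded in two steps: JSON parsing turns \u0026 into &, which leaves the HTML entity &#039;, and only a further HTML unescape recovers the apostrophe. A minimal Python sketch, using a shortened copy of the "title" field above (the variable names are illustrative only):

```python
import html
import json

# Shortened copy of the "title" field from the sample output above.
raw = r'{"title":"The Doctor\u0026#039;s Office: A 4D Book"}'

# Layer 1: JSON decoding turns the \u0026 escape into "&".
after_json = json.loads(raw)["title"]

# Layer 2: HTML unescaping turns the "&#039;" entity into "'".
after_html = html.unescape(after_json)

print(after_json)  # The Doctor&#039;s Office: A 4D Book
print(after_html)  # The Doctor's Office: A 4D Book
```

A client that only parses the JSON is left with the HTML entity in the title, which is the symptom reported here.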

Steps to reproduce:

  1. Create a node with a title containing an apostrophe character.
  2. Create a view containing a REST Export display.
  3. Set the view format to "Fields."
  4. Add the "Content:Title" field to the field list.
  5. Preview the results of the view.
  6. Observe that the apostrophe character is double encoded.

🐛 Bug report
Status: Active
Version: 9.5
Component: REST
Last updated 10 days ago
Created by: 🇺🇸United States alex.stone.filament

  • VDC

    Related to the Views in Drupal Core initiative.

  • Needs tests

    The change is currently missing an automated test that fails when run with the original code, and succeeds when the bug has been fixed.

Comments & Activities

Not all content is available!

It's likely this issue predates Contrib.social: some issue and comment data are missing.

  • 🇩🇪Germany internetter Erfurt, Thüringen

    I discovered other problems with REST export of views and image URLs. Perhaps they are related:

    The ampersand "&" separating URL parameters was encoded as "\u0026amp;" in multi-parameter URLs produced by the image URL formatter (when using focal_point).

  • Status changed to Needs review about 2 years ago
  • Test run: Environment: PHP 8.1 & MySQL 5.7, last update about 2 years ago, 29,366 pass
  • 🇩🇰Denmark ressa Copenhagen

    Thanks @RichardDavies! Your patch works perfectly in Drupal 10: it leaves single quotes (') alone and makes the output a lot cleaner. Before and after:

    • "name": "C\u00f4te d\u0026#039;Ivoire",
      "name": "Côte d'Ivoire"
      
    • "name": "Pes\u00e4pallo",
      "name": "Pesäpallo"
      

    Also, much cleaner looking HTML (before and after):

    • "title": "Facts about C\u00f4te d\u0026#039;Ivoire"
      "title": "Facts about Côte d'Ivoire"
      
    • "field_body": "\u003Ch2\u003E1. Here are some facts\u003C\/h2\u003E ..."
      "field_body": "<h2>1. Here are some facts<\/h2> ..."
      

    There's also the related issue "single quote character not escaped in REST output" (status: Active), about single quotes ('). I believe they don't need to be HTML-encoded into &#039;, because JSON strings are delimited by double quotes, so single quotes need no escaping.

    Should it be looked at here, or in the other issue?

    I am attaching a re-rolled patch for Drupal 10.1, since I have had bad experiences rebasing Drupal core MRs in Drupal's GitLab. Because the patch is static, it can also be applied through Composer.
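
The point about single quotes can be checked with any JSON library. A minimal Python sketch (not Drupal code; the sample value is taken from the before/after list above) showing that an apostrophe inside a double-quoted JSON string is valid as-is and round-trips unchanged:

```python
import json

name = "Côte d'Ivoire"

# ensure_ascii=False keeps non-ASCII characters such as "ô" unescaped.
encoded = json.dumps({"name": name}, ensure_ascii=False)
print(encoded)  # {"name": "Côte d'Ivoire"}

# The apostrophe survives a round trip without any escaping.
assert json.loads(encoded)["name"] == name
```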

  • 🇩🇰Denmark ressa Copenhagen

    Also, fixing the issue "Allow JSON format when "Accepted request formats" is not defined" (status: Active) would get REST and Views export into a great state, working out of the box.

  • Status changed to Needs work about 2 years ago
  • 🇺🇸United States smustgrave

    Can the issue summary be updated to include the proposed resolution?

    Also, a test showing the problem will be needed, please.

    Thanks!

  • 🇩🇰Denmark ressa Copenhagen

    Thanks for reviewing it @smustgrave. I would also be interested in a description of what the regex actually does. @RichardDavies: Perhaps you can help with this?

  • 🇩🇰Denmark ressa Copenhagen

    I also now see that the preview is still escaped, so we probably should do the same there? I'll add the tasks in the issue summary.

  • 🇺🇸United States RichardDavies Portland, Oregon

    @ressa The regex searches the $output string for all occurrences of \\uXXXX, where each X is a hexadecimal digit (0-9 or A-F, case insensitive), e.g. \\u0026.

    For each match that it finds, it uses the mb_convert_encoding() function to convert that character from one encoding to another encoding. Then any double quote characters (") are prefixed with a backslash character (\) so that they're properly escaped according to the JSON string requirements.
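
The behaviour described in this comment can be sketched in Python. This is not the patch itself, which is PHP and uses mb_convert_encoding(); the function name unescape_unicode and the sample string are hypothetical, and chr() stands in for the encoding conversion.

```python
import json
import re

# Matches a backslash, "u", and exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def unescape_unicode(output: str) -> str:
    """Replace each \\uXXXX escape with the character it stands for."""
    def convert(match: re.Match) -> str:
        char = chr(int(match.group(1), 16))
        # A decoded double quote must stay escaped to keep the JSON valid.
        return '\\"' if char == '"' else char
    return _ESCAPE.sub(convert, output)

encoded = r'{"name": "C\u00f4te d\u0027Ivoire", "quote": "\u0022hi\u0022"}'
decoded = unescape_unicode(encoded)
print(decoded)  # {"name": "Côte d'Ivoire", "quote": "\"hi\""}

# Both forms parse to the same data.
assert json.loads(decoded) == json.loads(encoded)
```

As written, the sketch only re-escapes double quotes, as the comment describes. It does not handle an escaped backslash (\u005c), control characters, or characters that JSON writes as two consecutive \uXXXX escapes.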

  • 🇩🇰Denmark ressa Copenhagen

    Thanks @RichardDavies! Both for working on this solution, and explaining the regex. I have added it in the Issue Summary.

  • 🇫🇷France nicolasgraph Strasbourg

    Patch #86 causes malformed UTF-8 characters for emojis.
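
One plausible cause, offered as an assumption rather than something confirmed in this thread: JSON writes characters outside the Basic Multilingual Plane, which includes most emoji, as a surrogate pair of two \uXXXX escapes, and converting each half on its own does not yield a valid character. A minimal Python sketch of pair-aware unescaping (the function name and sample data are hypothetical, and this is not the PHP patch):

```python
import re

# A high surrogate followed by a low surrogate, or any single \uXXXX escape.
_ESCAPE = re.compile(
    r"\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
)

def unescape_with_surrogates(output: str) -> str:
    def convert(match: re.Match) -> str:
        if match.group(1):
            # Combine the surrogate pair into one code point.
            high = int(match.group(1), 16)
            low = int(match.group(2), 16)
            code = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        else:
            code = int(match.group(3), 16)
            if 0xD800 <= code <= 0xDFFF:
                # Unpaired surrogate: leave the escape untouched.
                return match.group(0)
        char = chr(code)
        # A decoded double quote must stay escaped to keep the JSON valid.
        return '\\"' if char == '"' else char
    return _ESCAPE.sub(convert, output)

encoded = r'{"mood": "\ud83d\ude00", "name": "Pes\u00e4pallo"}'
decoded = unescape_with_surrogates(encoded)
print(decoded)  # {"mood": "😀", "name": "Pesäpallo"}

# The result is valid UTF-8; a lone surrogate would raise UnicodeEncodeError here.
decoded.encode("utf-8")
```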

  • 🇵🇭Philippines _renify_ cebu