"missing required fields" error caused by Windows/Microsoft special encoding characters

Created on 16 December 2022, about 2 years ago
Updated 16 February 2023, almost 2 years ago

Problem/Motivation

When attempting to import a CSV produced by a Microsoft product on Windows, it's likely that invisible bytes are included at the front of the file (with Excel's "CSV UTF-8" export, this is typically the UTF-8 byte order mark). When importing the CSV, these bytes are interpreted as characters in the first header name and get in the way of detecting required fields. There are various techniques Windows users might use to strip them before uploading, but this can be a big ask in some cases, and it's also not obvious that they're the source of the issue when the `csv_importer` module reports a "missing required field" that is very clearly in the file being imported.
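
A minimal sketch of the failure, written in Python purely for illustration (the module itself is PHP, and `missing_required` is a hypothetical helper, not part of `csv_importer`). It assumes the invisible bytes are the UTF-8 byte order mark (EF BB BF):

```python
import csv
import io

# Assumption: the exported file starts with the UTF-8 byte order mark (EF BB BF).
raw = b"\xef\xbb\xbftitle,body\r\nHello,World\r\n"


def missing_required(data: bytes, required: list[str]) -> list[str]:
    """Hypothetical check: return required field names absent from the header row."""
    text = data.decode("utf-8")  # plain "utf-8" keeps the BOM as the character U+FEFF
    header = next(csv.reader(io.StringIO(text, newline="")))
    return [name for name in required if name not in header]


# The first header cell is "\ufefftitle", so an exact comparison with "title" fails.
print(missing_required(raw, ["title"]))
```

The first column is the only one affected, which would explain why the error names `title` even though the column is plainly visible in the file.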

Steps to reproduce

  1. Create a spreadsheet in Excel in Windows (in my client's case, using Microsoft® Excel® for Microsoft 365 MSO (Version 2202 Build 16.0.14931.20806) 64-bit on Microsoft 365 OS)
  2. Place "title" in the first column, and then fill in the rest of the columns needed for a Content Type.
  3. Export to CSV format by choosing Save As and picking CSV UTF-8 (comma delimited) (*.csv)
  4. Import the CSV file into the Content Type in the usual way via `csv_importer`.
  5. Observe that an error occurs; i.e. "Your CSV has missing required fields: title"

Proposed resolution

Apply the provided patch.
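
The patch itself is not reproduced on this page. As a rough illustration of the general idea only, and not of the patch's actual contents, here is a Python sketch that drops a leading byte order mark before the header row is read. `read_header` is a hypothetical name, and the BOM is assumed to be the only stray content:

```python
import csv
import io


def read_header(data: bytes) -> list[str]:
    """Hypothetical helper: decode CSV bytes and return the header row without a BOM."""
    # The "utf-8-sig" codec drops a leading BOM if present and otherwise behaves like "utf-8".
    text = data.decode("utf-8-sig")
    return next(csv.reader(io.StringIO(text, newline="")))


with_bom = b"\xef\xbb\xbftitle,body\r\nHello,World\r\n"
without_bom = b"title,body\r\nHello,World\r\n"

# Both variants now yield the same header row.
print(read_header(with_bom) == read_header(without_bom))
```

Handling this at import time means users do not have to re-save or post-process their exports.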

Remaining tasks

n/a

User interface changes

n/a

API changes

n/a

Data model changes

n/a

🐛 Bug report
Status

Active

Version

1.14

Component

Code

Created by

🇺🇸United States emanaton

Comments & Activities

  • 🇬🇪Georgia lashabp

    @emanaton Could you please provide an example file? I mean a "broken" one. Thanks.

  • 🇺🇸United States emanaton

    Not only will I provide you with a problematic file, I'll give you yet another patch! My users are now including characters that cause whole fields not to be imported. In this case, it's the "OPA Findings" column that's borked: every record past the third row comes in with that field blank.

    The attached patch is for the current version of the module (i.e. 8.x-1.15), and cleanses both the headers and the fields.

  • 🇺🇸United States emanaton

    Stick with me -- I'll get this right yet! I had a request come down that the various Windows characters be preserved, so I've added yet another line to the mix to cover that.
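
The attached patches are not shown in this thread. The Python sketch below only illustrates one way the two goals in the comments above could be combined: cleansing both headers and fields while preserving Windows characters. It assumes that "Windows characters" means Windows-1252 punctuation such as curly quotes and dashes, and that the offending characters are a BOM or control characters. All function names are hypothetical:

```python
import codecs
import csv
import io
import unicodedata


def decode_csv(data: bytes) -> str:
    """Drop a leading UTF-8 BOM, then decode as UTF-8 with a Windows-1252 fallback."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: input that is not valid UTF-8 came from a Windows-1252 export.
        return data.decode("cp1252", errors="replace")


def clean_cell(value: str) -> str:
    """Remove stray BOMs and control characters (except tab/newline), then trim whitespace."""
    kept = (
        ch for ch in value
        if ch != "\ufeff" and (unicodedata.category(ch) != "Cc" or ch in "\t\n")
    )
    return "".join(kept).strip()


def clean_rows(data: bytes) -> list[list[str]]:
    """Apply the same cleansing to the header row and to every field."""
    reader = csv.reader(io.StringIO(decode_csv(data), newline=""))
    return [[clean_cell(cell) for cell in row] for row in reader]


sample = b"\xef\xbb\xbftitle,OPA Findings\r\nReport,\x93quoted\x94 \x96 ok\r\n"
for row in clean_rows(sample):
    print(row)
```

Transcoding instead of stripping keeps the curly quotes and dashes that users typed, while still removing the bytes that confuse header matching.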
