All my migrations that previously worked with XML files encoded in UTF-16LE suddenly broke after upgrading to Migrate Plus 4.2. They now fail with:
Drupal\migrate\MigrateException: Fatal Error 73: expected '>'
Line: 542
Column: 20
File: in Drupal\migrate_plus\Plugin\migrate_plus\data_parser\SimpleXml->openSourceUrl() (line 51 of modules/contrib/migrate_plus/src/Plugin/migrate_plus/data_parser/SimpleXml.php).
It turns out that issue #3046753, "Make XML parser more resilient", introduced a call to trim() before simplexml_load_string():
  protected function openSourceUrl($url) {
    // Clear XML error buffer. Other Drupal code that executed during the
    // migration may have polluted the error buffer and could create false
    // positives in our error check below. We are only concerned with errors
    // that occur from attempting to load the XML string into an object here.
    libxml_clear_errors();
    $xml_data = $this->getDataFetcherPlugin()->getResponseContent($url);
    $xml = simplexml_load_string(trim($xml_data));
    foreach (libxml_get_errors() as $error) {
      $error_string = self::parseLibXmlError($error);
      throw new MigrateException($error_string);
    }
    $this->registerNamespaces($xml);
    $xpath = $this->configuration['item_selector'];
    $this->matches = $xml->xpath($xpath);
    return TRUE;
  }
The function trim() is not safe on multibyte-encoded strings, whereas SimpleXML handles multibyte data perfectly well. I don't think it is necessary to call trim() before simplexml_load_string(). If your XML has an empty line before the opening tag, it is not well-formed and requires special treatment. Adding trim() to the generic parser prevents it from working properly with Unicode data such as UTF-16LE.
Needs review
6.0
Plugins