If the XML source declares a default namespace, e.g.:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.a-domain.com">
  <contenitor>
    <article-node>
      <article-node__title>Article title</article-node__title>
      <article-node__body>Article body</article-node__body>
    </article-node>
  </contenitor>
  <another-different-contenitor>
    <article-node>
      <article-node__title>Article 2 title</article-node__title>
      <article-node__body>Article 2 body</article-node__body>
    </article-node>
    <article-node>
      <article-node__title>Article 3 title</article-node__title>
      <article-node__body>Article 3 body</article-node__body>
    </article-node>
  </another-different-contenitor>
</root>
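then the item_selector: and selector: XPath queries do not work, because XPath treats unprefixed element names as belonging to no namespace. The default namespace has to be registered under a prefix before the query runs, e.g. using $xpath->registerNamespace('root', 'http://www.a-domain.com'); and the queries then have to use that prefix.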
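The behaviour described above is not specific to Drupal, so it can be reproduced outside the migration. The following is only a minimal sketch using Python's standard xml.etree.ElementTree, not the parser that the simple_xml plugin uses; the prefix "ns" and the trimmed XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.a-domain.com"

# Trimmed version of the source XML above, with the same default namespace.
XML = f"""<root xmlns="{NS_URI}">
  <contenitor>
    <article-node>
      <article-node__title>Article title</article-node__title>
    </article-node>
  </contenitor>
</root>"""

root = ET.fromstring(XML)

# Unprefixed names only match elements in no namespace, so nothing is found.
unprefixed = root.findall(".//article-node")

# Binding the namespace URI to a prefix ("ns" is an arbitrary, hypothetical
# choice) and using that prefix in every step makes the query match.
prefixes = {"ns": NS_URI}
prefixed = root.findall(".//ns:article-node", prefixes)
titles = [
    node.findtext("ns:article-node__title", namespaces=prefixes)
    for node in prefixed
]

print(len(unprefixed), len(prefixed), titles)
```

The unprefixed query returns an empty list, while the prefixed one finds the article and its title. This is the same effect that registering the namespace before the query has in PHP.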
Request:
1) A code improvement that handles this case automatically
or
2) A way to declare/register the default namespace within the yml (e.g. default_namespace: <value>)
Current workarounds:
1) Manually remove the default namespace from the XML source (doing this programmatically is not simple, because namespace declarations are not "normal" attributes)
2) If the previous workaround is not feasible, change the XPath queries in the yml as described in https://www.palantir.net/blog/migrating-xml-drupal-8
Example (based on the XML above):
Change from this:
...
source:
  plugin: url
  data_fetcher_plugin: file
  # simple_xml used here instead of xml because it supports xpath better:
  data_parser_plugin: simple_xml
  urls: ./modules/custom/your-module/data/source.xml
  item_selector: //article-node
  fields:
    -
      name: article_title
      label: 'Title'
      selector: article-node__title
...
To this:
...
source:
  plugin: url
  data_fetcher_plugin: file
  # simple_xml used here instead of xml because it supports xpath better:
  data_parser_plugin: simple_xml
  urls: ./modules/custom/your-module/data/source.xml
  item_selector: '//*[local-name()="article-node"]'
  fields:
    -
      name: article_title
      label: 'Title'
      selector: '*[local-name()="article-node__title"]'
...
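To check which nodes a namespace-agnostic selector should pick up from the sample XML, the idea can be tried outside Drupal. This is only a rough sketch, assuming Python 3.8 or later: the standard xml.etree.ElementTree does not implement local-name(), so the sketch uses its "{*}name" wildcard, which likewise matches an element name regardless of namespace. It is an analogue of the selectors above, not the same XPath expression.

```python
import xml.etree.ElementTree as ET

# Source XML from the top of this issue, with the body elements left out.
XML = """<root xmlns="http://www.a-domain.com">
  <contenitor>
    <article-node>
      <article-node__title>Article title</article-node__title>
    </article-node>
  </contenitor>
  <another-different-contenitor>
    <article-node>
      <article-node__title>Article 2 title</article-node__title>
    </article-node>
    <article-node>
      <article-node__title>Article 3 title</article-node__title>
    </article-node>
  </another-different-contenitor>
</root>"""

root = ET.fromstring(XML)

# "{*}name" matches "name" in any namespace (or in none), much like
# *[local-name()="name"] does in full XPath.
items = root.findall(".//{*}article-node")
article_titles = [item.findtext("{*}article-node__title") for item in items]

print(article_titles)
```

All three article nodes are matched, in document order, without registering the namespace anywhere.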