You were right, the original schema was not updated! Only people with an old version that run update hook got the schema fixed. Thanks for the fix. I added the unique key constraint that was also missing in your patch (add it to your instance as well ;) ).
Please confirm it works for you now with this fix.
guignonv → changed the visibility of the branch 3.0.x to active.
guignonv → changed the visibility of the branch 3.0.x to hidden.
The fix seems correct so far, but I need to double check why the column was not there already. It was modified a long time ago (2-3 years I think) and I wonder if the fix is to add the column or remove the instructions using it.
Here is my new proposal to manage languages:
It will be managed by ExternalEntityType class. On the external entity type edit form, there will be a new horizontal tab "Language settings", just below the "Storage" tab. On that tab, the user will be able to select a language and it will display current storage configuration form with language specific overrides. In other words, we would have a base storage config (data aggregator config) with possible overrides by languages. For instance, for a REST client, the REST service URLs could be overridden according to the language; for TSV clients, the source file could be changed or a filter could be added; for SQL clients, the queries could be adapted. Since it's the aggregator config that is overridden, it would also be possible to even change the storage client (ie. provide translations from another storage client). The idea would be to provide a checkbox to allow a "language" storage config override and highlight what is overridden (on client side through Javascript). Unchecking+re-checking the checkbox would reset "language" storage config for update. It will be possible to improve this user interface later.
On the run time, the config to load according to current language would be selected by the external entity type when ExternalEntityType::getDataAggregatorConfig() is called.
No change is required to storage clients (including base). No change is required to ExternalEntityStorage.
Modified: ExternalEntityType, ExternalEntityTypeForm and external entity type config schema (...and ExternalEntity for ->setTranslatable(TRUE)
;-) ).
That could be enough but we could get a step further to simplify user config management: we can take profit of the Token module recently added as a dependency to use it in storage clients where it could be pertinent. For instance, a REST service URL could use the token [language:langcode]
and may not need to override storage config for any language.
What do you think?
While this approach works, after thinking, I'm not sure it's the best approach to handle multiple languages. See #3476326 for discussion. I think we should not aggregate multiple languages but rather discriminate which client (config?) to use according to a requested language.
With #3506455, we can (I did not succeed yet though...) merge several language sources into a single external entity with sub-arrays keyed by __external_entity_translation__<LANG_CODE>
.
For instance, example 1:
[
'id' => 42,
'title' => 'English title',
'somefield' => 'other text',
'someotherfield' => 501,
'__external_entity_translation__fr' => [
'id' => 42,
'title' => 'Titre français',
'somefield' => 'autre texte',
'someotherfield' => 501,
],
]
@rp7, in your current approach, you expect raw data to be an array keyed by languages.
For instance, example 2:
[
'en' => [
'id' => 42,
'title' => 'English title',
'somefield' => 'other text',
'someotherfield' => 501,
],
'fr' => [
'id' => 42,
'title' => 'Titre français',
'somefield' => 'autre texte',
'someotherfield' => 501,
],
]
What I don't like with example 1 is that untranslated fields can be duplicated (and it could cost a lot of memory, depending what is stored) and what I don't like in example 2 is that some object may be untranslated but have a key corresponding to a language code which may lead to an incorrect "language availability guessing". If you systematically change the raw structure of all external entities to add a language layer, then there would be no "incorrect guessing" but it would add a layer to raw data and make things more complicated.
I agree I don't see better/ideal solutions at the moment, but we should discuss how things could be handled here. From my side, my main concern is not how the final translatable raw structure will look like but rather how that structure will be generated.
For instance, you could consider consuming a REST API where the language is selected in the URL: you would use a vertical aggregator with as many source as languages you support. Each source provides values for a given language.
Now, if you consider a TSV file (Excel-like file) where there is a "lang_code" column telling the language of the record. You could find multiple time the same ID but with a different value for its "lang_code" column. How to aggregate that? Or should we consider each translated record as independent, but resolve the record to use using both its identifier and its lang_code? It can be interesting to do so to only load the appropriate translation.
So far, ideally, I would prefer the system to be able to only load the appropriate translation, without querying multiple sources (or multiple time the same source). For REST clients, it would mean to select the appropriate URL according to the selected language, for TSV clients or an SQL client to filter on a language column, for a file system client maybe to select the directory scheme according to a language code, etc. I think language should be managed by the storage clients because they know how to filter by language. Otherwise, they would always return all the translations and it would not be efficient. Now, how could that be achieved? I believe there should be a language parameter in the storage client ::load()/::loadMultiple() methods but it's an API change (fortunately, we're still in beta) or a service could be used to get the language instead? What if we want to edit and save another language while the current language is different? I need to think more about it but if you have ideas, please share! :)
What I would prefer to get is example 3:
- Query storage client to get entity 42 in "en":
[ 'id' => 42, 'title' => 'English title', 'somefield' => 'other text', 'someotherfield' => 501, ]
- Query storage client to get entity 42 in "fr":
[ 'id' => 42, 'title' => 'Titre français', 'somefield' => 'autre texte', 'someotherfield' => 501, ]
guignonv → created an issue.
I added translation and I can confirm this problem is related to translation: I can reproduce the issue.
I cannot reproduce the problem. I tried with a XnttTSV storage client and the preview works with rendered entity and its title. I suspect there is something else. What storage client did you use?
Regarding the xntt_views module code, it needs many improvements and many things are still to be implemented properly. In this case, we use the default filter class provided by the views module except for string because I wanted to add a 'FUZZY' filter to test a custom filter behavior.
As @notkataoka noticed, the default string filter provided by views works with database strings and uses the 'LIKE' operator and modifies the filter value which is not suitable for other storage back-ends to efficiently manage filtering.
Filters are passed to the storage clients using StorageClientInterface::transliterateDrupalFilters(). That is where filters can be translated to what is supported by the client when possible.
There is not an official/standard list of supported operators, but a basic list is provided in the Drupal doc and in XNTT StorageClientBase::testDrupalFilter():
'=', '!=', '<>', '>', '>=', '<', '<=', 'STARTS_WITH', 'CONTAINS', 'ENDS_WITH', 'IN', 'NOT IN', 'IS NULL', 'IS NOT NULL', 'EXISTS', 'NOT EXISTS', 'BETWEEN' and 'NOT BETWEEN'
We should list what is used by views to add to this list (eg. "NOT STARTS_WITH").
StorageClientBase::testDrupalFilter() is used when "standard" filters are not supported by the storage client, to provide a "degraded" filtering support on results. The current implementation does not support properly negated filters (ie. NOT ...). Some are, some aren't. It should be fixed because we can meet "NOT CONTAINS" for instance. The code needs to be refactored to handle "NOT" + any other filter.
For investigations, the method ::getTableMapping() comes from SqlContentEntityStorage class while XNTT storage implements ContentEntityStorageBase (which is correct).
I'll continue investigating...
While the code does not look bad, I could not reproduce the initial problem. However, with no surprise, I tried with a SQL storage client. I need time to try with another client. I'm also wondering if the problem would still be there with "::opNotContains()" "::opNotStartsWith()" and "::opNotEndsWith". I don't have all my ideas clear as it's late but I think the "final" operator is supposed to be provided by the storage clients themselves, so there should be no need for this patch. I need to review that more seriously...
guignonv → made their first commit to this issue’s fork.
@guignonv: Maybe this is why you're not running into it?
Yes that might be why because I rarely use full renderded entity. But now, I could try to reproduce the problem later to fix it.
Thanks! :)
What I can say is that I had a look to the views code of Search API to create the views plugin. I'm quite sure their code solves this problem in a way or another. So, my advice would be to see what they did... :p
I think it has been fixed in the current dev version (commit #397e51aa1227ddf52271bc26d7584b280275238d), not the last beta one.
Either try the dev version or patch until we release a new beta... :-S
Hi!
Thanks for the patch. For this feature, I had in mind to use features of the token module. But when I implemented that part, the token service was not part of the storage client services and it has recently been added.
Wouldn't it be more powerful to just support token replacement as it would go beyond static values provided by the query as it could include dynamic values (such as current used login for instance, or a user token, etc.)?
So I would prefer to keep using $this->entityTypeManager->getStorage('my_external_entity')->load("1234");
(to only keep a real id as parameter and keep the existing logic) and provide "color=blue" using another more generic mechanism in the single query parameters part (ie. ::getParametersFormDefaultValue('single')). I would see something either using tokens (which could return nothing if not available for instance) or using an event.
@arwillame, could you add a test in RestClientFunctionalTest.php to test your use case and ensure the fix remains stable in the future?
Your use case could be documented in the test to avoid regressions.
OK. I think I understand the problem but I need to think if your approach is the best. There might be a deeper design issue (-probably my bad-). I'm working other projects in parallel so I'll do my best to think about it as soon as I can and come back to you here. No hurry?
Looks good. Just to make sure I understand everything, it means you would get a single entity with no text keys and you will have to map numeric "keys" to fields then, right?
guignonv → made their first commit to this issue’s fork.
@colan, I don't see any recent commit on v2. Are you sure it has been fixed on that version as well? Or was this issue made on the wrong version maybe and the problem did not exist on v2?
Looks good to me. We can merge and see if we got back a similar problem in another use case one day. If it ever happens, we will advise at that time.
guignonv → created an issue.
Merging this will require to release a new beta as the API changed since storage clients now need to have a token service to be compatible.
It means existing plugins need to be updated and have a minimum requirement for this new beta (to create).
I'll update the plugins I manage and check for the one I don't. Meanwhile, you can merge.
Reviewed and merged.
I think I met that problem once. It rings a bell to me. Well, right now, I'm busy on another project but I'll be able to get back on this in 1 or 2 weeks I think/hope.
Merged! Thanks for the implementation. :)
(...) why is it necessary to map to a field that corresponds to a JSON property? Why not allow the file field to directly map to a property that can hold either a single URI or an array of URIs? Is there a specific use case where mapping to a label is beneficial? Additionally, is it possible to modify the approach so that the file field has its own property for direct mapping?
I'm quite sure we can get rid of an extra text or URI field to map a file field. Like you said, I think it is possible to have the file field has its own file URI property. I didn't investigate though, so I can't guarantee it possible. That's why I added to the roadmap for v3 what I put in #2 here. It needs to be investigated, and it is quite complex I think. I was planning to do it in the coming months but not in the coming weeks. Feel free to investigate! :-) But I may not be able to provide extensive support due to my other priority tasks.
Furthermore, I have this parent::addFieldValuesToRawData($field_values,$raw_data,$context); commented out, as I was pretty sure it does nothing for the file field. Is that fine? or should I include it?
I'm not sure. I think I left it in case, one day, the parent needs to do something special that should be applied to every field mapper, for instance triggering an event. But if it is commented because it does nothing right now, it's ok. If one day, the parent does something, then (maybe after a new issue is raised... :) ) we would put back the call to the parent method. Not a big deal.
Code looks good. I also enjoy the use of custom response decoder as it gives an example of using custom response decoders in external entities. I need to test it though... I'll do my best.
I'm wondering though if it could have been achieved using data processors. But I need to test to really understand the problem when facing data and not theory! ;-p
Also, I need to make sure it will work the way "reverse", when you need to save external data. I'm not sure it does... :-s
Looks good so far. Let me know when it's ready to be merged! :)
That was not an easy one. :)
In Drupal, when you use $entity->save() on a new entity, it updates the entity identifier to the new generated value and returns the global constant SAVED_NEW.
With external entities, an entity can come from one or multiple storage clients (sources). When you want to create a new record, a new identifier may be required as input (for example for the "File" storage client) or it can also be optional and generated by the storage client source. To make it more complicated, when you use multiple sources with "foreign keys", you may have to create records in "secondary" sources that may use different identifiers than the one you may provide.
The only way to have it working is to update the new entity identifier after it has been saved on the external sources. This action must be performed by the storage client because it is the one that got the entity, turns it into a raw array and send that array to the remote source. But the remote source must provide back the new identifier! Otherwise there is no generic way to get that new id. For the REST client, we will expect the query used to save the new entry to return the newly saved entity with its new identifier. I've updated the code to have it grab that new identifier and update the corresponding entity identifier. I tested and it appears to work.
What I didn't test yet is if you want to specify an identifier value by yourself. External entities does not allow to provide an identifier on the entity creation form as it follows the default behavior of ContentEntityForm. A work around would be to have another text field "field_specified_id" using the same mapping as the id field and use that field to specify a new id. Not tested though. I wonder what will be the behavior when the raw entity array will be generated: will the "null" value of the id field override the provided value of field_specified_id? I need to check that. There are no rules (or events) at the moment to tell in which order fields should be used to populate a raw entity array. (something to fix one day...)
At the moment, I did not create a corresponding automated test yet. Also, I need to check the other base clients: File storage client and SQL database storage client. And I did not check if the group aggregator handles new id properly when using "foreign keys" (it supposed to as it tricks the entity id for each client).
Meanwhile, I let you test the changes on the issue fork.
Set to major as saving is a core feature and it should work.
It make sense. I'll add a note for a future implementation of that feature from 📌 [META] Roadmap for v3.0 Active (currently under "For v3 beta 5"):
See if xntt_file_field module could rely on an added file field property (computed) rather than on another xntt field for the URI mapping
As those are tightly related.
Merged.
OK, I think it might be just a very stupid thing: I prepared a file data structure that I did not pass to the file create() method...
Could you try this fix and let me know if it works?
By the way, I never tried the file field plugin with data alteration operations! I'm not sure it will work in that case but who knows... :-p
From my perspective, you can not alter an external file you got from a URI but you can alter the URI you got. So, the code that might be missing is when saving data, the file field plugin should get the new file URI and report it to the text field used to map the URI. But I'm not sure it will be so easy. What if the URI field as also altered manually? Who wins? The value entered by the user or the one provided by the new file? Maybe, the behavior should be set on the file field mapper plugin settings? --> a new feature request? :-)
And for information (to people wondering how that works and did not find where it is explained), most of it is explained in this README.md. It could be improved though. Let me know what could be clarified.
guignonv → made their first commit to this issue’s fork.
Same issue on 3.0.x. Tested patch and merged on 3.0.x. Thanks! I let v2 maintainers apply it to v2.
- Could you provide a way to reproduce your problem on a fresh install?
- What storage client(s) do you use? Is it set to "read only"?
- Did you system use to work before or has it been like that all the time? (When did that change?)
- May your storage source be unavailable from times to times? (leading to external entities not being loaded)
- Do you use translation? (Content index: "entity:discourse_post/8:und" makes me think you do not. I would not see a direct reason explaining your issue but language management in xntt is a thing that recently changed)
It appears to be related to the way Search API works. A deep understanding of how it works is needed. For instance, I don't know if, when you access and indexed item, Search API tries to update it from external content and if that content is not available (even momentarily), it would clear the item. It's a lot of investigation work and I don't have much time so you'll need to provide me as much information as you can on how your system works.
If it is too complicated to solve, I would consider using the Xntt Manager → instead of search API in this case. It can be used to synchronize (ie. similar to Search API indexing) local Drupal content entity with external entities content. You can do it manually or using a cron to do it periodically. Then, you can use those "synchronized" content entity as you wish. If you synchronized to a node type content, you can use all features available on nodes. It could be a simple alternative (which can be deployed in parallel for testing). Keep me tuned.
Thanks! :) Updated.
Looks good to me. Yes, we can merge. I agree it should be a core requirement as it is very useful everywhere. A new beta would be needed indeed. The roadmap should be updated accordingly.
I'll have to check the storage plugins I manage to have them updated when needed (if they override the base constructor) but I'll need that new xntt beta release before.
@colan feel free to merge when you thinks it is ok.
Looks good to me. But it means we have to add the (core) token module as a dependency. I fixed tests accordingly. They pass on my environment but not on GitLab pipeline, probably because of the new dependency? Weird.
The fix is quite simple. Thanks! Merged.
I hope it is not a high priority for you nor urgent as I don't have time currently to review this feature request... :s
I'm not against the idea of splitting Rest client functions into pieces though. I would need time to read carefully what you put and understand what is your final goal and then I could have an opinion.
Merged.
Note: the second query should be corrected as it uses "de" instead of "fr". ;)
https://www.dummy.com/fr/jsonapi/node/publication?filter[id]=d4133e27-92a8-40c9-9b57-122494e444c5
I'm not sure as I am not currently dealing with multiple languages so I can' tell (it's far in my stack but it's there).
I'll let @das-peter answer if he passes by, but to give more info in case he does not, this patch allows to aggregate multiple language version of a content into one external entity instance.
So, let's say you use a REST client that gather content from a website and that content site provides 2 endpoints, one for English and one for French. Each content item has the same id on both endpoint but its content is in different language for each endpoint. You use the vertical aggregator with 2 REST clients, one for each endpoint, and you aggregate them using the new option "as a translation" for the second client. Then you are supposed to have a drop-down letting you choose the language corresponding to the client endpoint and it will add the translated content as a sub-field of the external entity instance. Your xntt array would look something like that, more a less:
[
'id' => 1234,
'title' => 'Some title',
'content' => 'Some content in English.',
'__external_entity_translation__fr' => [
'id' => 1234,
'title' => 'Un titre',
'content' => 'Du contenu en français.',
],
];
...that's what I understood so far.
Merged.
PHP Unit tests fixed now.
C.I. has been enabled. However, there are 2 errors in the "testHorizontalAggregation" test. I checked the logic of test and it appears ok to me so there is a bug in the horizontal aggregator client I think. :-( I think it used to work so it might be a real regression.
Also, there is a couple of validation warnings and many of them could be easily fixed. If somebody volunteers, go ahead for a PR! :-D
I mark it fixed to cleanup the issue queue. Feel free to reopen if not.
I support your approach. I don't have time to test such a thing right now but I'll try to next week. I reviewed the code though and it looks very good!
Fair enough! :) Merged! (I got time to update the repo now)
I support the idea of using tokens and there may be other places where it could be useful as well.
But it means we will need the token module as a requirement right? (I didn't play with the token module in my modules since Drupal 7!) Not a big deal though...
Merged. Thanks!
Reviewed code. Look good and fair. Could you update the repository so I can merge it thanks? :)
Weird because I'm convinced I had increased URL length 2048 a long time ago! But it appears not. :) Thanks for pointing that out and I prefer to use the largest limit 2048 to be safe. I doubt people would have thousands of external entity types in their databases and would need a shorted field length to save database space! ,:-)
I also added the URL for counts for consistency.
guignonv → made their first commit to this issue’s fork.
Hi!
I reviewed the changes. Looks quite good so far and I agree on the change to get mappable fields from the xntt. It makes sens.
However, I less confident in this part you removed:
$derived_entity_type = $this
->getExternalEntityType()
->getDerivedEntityType();
if (empty($derived_entity_type)) {
$this->logger->warning('FieldMapperBase::getFieldDefinition(): Cannot get external entity derived type.');
return NULL;
}
This is used by annotation entities. I did not make automated tests on these so I'm not sure it will work properly. I believe you removed that part also because it was not working with your use case right?
What about keeping that part but commented then?
/*
@todo The following code as been disabled and impacts on annotation nodes should be checked before removing it permanently.
@code
$derived_entity_type = $this
->getExternalEntityType()
->getDerivedEntityType();
if (empty($derived_entity_type)) {
$this->logger->warning('FieldMapperBase::getFieldDefinition(): Cannot get external entity derived type.');
return NULL;
}
@endcode
*/
...or something similar.
Thanks!
Looks fine to me. I just reviewed the code and did not test it yet but I believe it would work.
I'll try to contact him as well and keep you tuned.
I've change the title and the description because we already have a set of tests but they are incomplete.
It might be still relevant for v2 but not anymore for v3. v3 Has also debug features to get better backend error support. Since it's a very old issue, I close it.
On v3, this is currently handled for a single custom header (ie. "Custom header" option in the "Authentication" section). Do you need more than one header item or could some of them be passed to the other section "HTTP protocol > Additional HTTP headers"?
Should an event be added to support more header customization?
Sounds good! Especially if I got some more help on v3! :-D
Regarding the module maintenance, I've asked
Attiks →
to get access and work on a v3. You should ask him to get access too.
Some of the issues you listed are already solved, some others aren't. When I started v3, it was because I needed many changes in v2 and the current maintainers could not handle those. So I got access to work on a separated v3 and left v2 to its current maintainers, to not break anything. That's why there are currently a v2 and a v3. Now v3 is quite mature, it might be time to discuss of the future of v2 with its current maintainers. Do they still need it or can they migrate to v3 while there are many breaking changes in the architecture...
I didn't read everything in details (not enough time to be honest) but I leave this post as a note for people who might want to do something for that issue.
There is something that has not been implemented yet in the external entity API for entity queries: in StorageClientBase::testDrupalConditions(), there is currently no (or limited) support for complex filtering on fields using syntax with sub-fields (like it is the case here with entity references) or with '%delta' (eg. ->condition('tags.%delta.value', 'news'))
). That might be the reason why it does not behave like other Drupal entities with references.
I don't thinks this is still an issue on v3 (I got a 404 in such case). Could you confirm it is solved or give steps to reproduce the error? Do you use annotations, cache?
That's already supported in v3. It's documented on the UI under the "Endpoint options" section. I move the issue back to v2 so their maintainers will decide to close or support.
This feature is already partially supported by v3. V3 supports token authentication in the HTTP header as well as in the query string. However, there's no plugin system for more sophisticated ways of authentication.
It could be a nice feature though. I would see either a plugin that would provide automatic authentication for current Drupal user (would allow SSO but it could also use some user custom account settings) or it could be achieved through an authentication event thrown for REST queries that needs authentication. It might add additional REST settings to specify what requires authentication (all or only PUT/UPDATE/POST, or some calls,...).
This is already supported by v3. I move it back to v2. Either v2 maintainers decide to close it or do the support.
Added Sanity plugin and vertical/horizontal aggregators
Updated to clarify current status with views support in v3.
The views integration is done on v3 through a "companion" module (ie. sub-module provided with external entities that needs to be enabled) but needs improvements. So far, I use the view support of v3 on a production site and the only major problem I remember was column sorting not available on table display. But I don't have a good memory... I think there are also problems with multi-lingual support which is not complete in v3 yet (as you saw in other issues ;) ...there is still a PR that needs to be reworked/updated).
Regarding the documentation, it refers to v2 and it should be updated to tell that v3 does support views. In fact, it IS possible. I got inspired by the Search API module → . But still, keep in mind that it *may* not be efficient with REST services... while it can be relevant with other storage plugins (like external files or databases).
The separate module you saw is the one I did for v2, that I'm not really supporting it anymore (minor fixes) since I've integrated it to v3 (ie. supported in v3, no double work).
It appears I'm the only one who worked on that part and I would be grateful is I could get some help! I'm working on many things (and it's an understatement! :) ) and I have to prioritize my developments. Improving the views support is in my list but in the middle/bottom. {:-)
avpaderno → credited guignonv → .
guignonv → created an issue.
Any news on that one? Should I mark it fixed?
Noted! Thank you very much for your time! :)