- 🇫🇷France guignonv
This topic is two years old, but I'd like to add some comments and updates for people who come across it later.
First, a couple of updates regarding the External Entities module:
- More than two "types" of connections exist by now:
  - REST (native), with its derived Wiki endpoint (native) and a new specialized endpoint for BrAPI (Breeding API), which shows that the REST base is generic enough to build other clients on.
  - Databases: supports both local and external PostgreSQL and MySQL databases so far. Works with custom "raw" SQL queries and supports CRUD and filtering. There is a derived client for a specialized database schema (for biological research) called Chado Light, with a more user-friendly interface, which shows that it's possible to build on the base "database" client to provide user-friendly interfaces for specific database schemas.
  - Files: supports all records in one file, each record in a separate file, a mix of both, and files that are data records themselves (e.g. an image file). There is a set of derived clients supporting CSV/TSV, JSON, XML and YAML.
  - A "type mixer", xnttmulti, that mixes several sources (any combination of the above) into one. Sources can be filtered and merged by groups or accumulated (i.e. a given external entity id can gather data from 2 sources, adding or overriding fields, or 2 sources can provide the same type of entities, which are then listed one after the other). It also supports join fields between sources (i.e. merging data from one source into another using a field of the first source as the key to fetch the item from the other source, rather than using a single identifier key for all sources).
- There is also an extension module called xnttmanager that enables automatic creation of "annotation entities" as well as data synchronization and/or caching. It can also list the available external fields (mapped or not) as long as the ID field is mapped (this limitation may be removed soon), and it highlights valid and invalid mappings. The ability to automatically generate a corresponding Drupal field structure for an external entity is a feature request on its way.
- For data mapping, there are now field mapper plugins. Two are built into the External Entities module: simple and JSON Path. Another plugin, xnttstrjp, handles expressions that allow some PHP string functions.
- There is a (currently alpha) plugin, xnttviews, that makes external entities work with Views natively. It's more relevant with local sources such as databases or files than with REST sources, for performance reasons (REST sources require web requests).
- Another issue that was not mentioned is that it was not possible to map Drupal file/image fields to external content. That was a pity, because many plugins that work with 'file' or 'image' field types could not be used with external entities. This problem is now solved with the latest (currently 'dev') version of the xnttfiles module. It's now possible to map file or image fields to external content without falling back to a link field with an external image cache module, as used to be necessary.
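To illustrate the merge and join behaviour described for xnttmulti in the list above, here is a minimal Python sketch. It is not the module's code or API: the `merge_sources` function, its parameters and the sample records are all hypothetical, and real sources would of course be queried rather than held in dictionaries.

```python
# Hypothetical sketch of source mixing; not the xnttmulti implementation.
def merge_sources(primary, secondary, join_field=None, override=True):
    """Merge two sources given as {entity_id: {field: value}} dicts.

    join_field: if set, the value of that field in the primary record is
    used as the key to look up the secondary record, instead of the
    shared entity id.
    override: if True, secondary values replace primary ones on conflict;
    if False, secondary values only add missing fields.
    """
    merged = {}
    for entity_id, record in primary.items():
        key = record.get(join_field) if join_field else entity_id
        extra = secondary.get(key, {})
        if override:
            merged[entity_id] = {**record, **extra}
        else:
            merged[entity_id] = {**extra, **record}
    return merged


# Made-up sample data: a database source, a TSV source and a joined source.
db = {"g1": {"name": "Accession 1", "status": "unknown", "sample": "s9"}}
tsv = {"g1": {"status": "processed"}}
by_sample = {"s9": {"experiment": "E42"}}

overridden = merge_sources(db, tsv)                      # TSV status wins
added_only = merge_sources(db, tsv, override=False)      # DB status kept
joined = merge_sources(db, by_sample, join_field="sample")
```

In this sketch each source is read once and the per-entity records are combined afterwards, which is the property the mixer relies on.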
Regarding performance, it depends on what you want to achieve, but several solutions are already available. You can use the local Drupal cache, replicate data to local (non-external) Drupal entities with automated synchronization (using the xnttmanager module), and you can also index external entity content with modules of the Search API ecosystem.
Note: with database or file sources, performance should be fine without any of that, since we're working on local data.
So, the list above covers features that I think should be kept in case of a "redefinition" of what Drupal external entities should be.
A last note: most of the modules mentioned above are still in development, but they already fulfill many needs and are promising. It is, for instance, possible to have an (old) commerce site with its own database and to build alongside it a Drupal site that accesses that database (even while it's live) and that can (later) take over the database. It's also possible to use source mixing or xnttmanager to load data from an external source and convert/store it into another source (e.g. read data from a file and save it into a database, or vice versa, with xnttmulti, or duplicate external data into Drupal as "regular" Drupal content entities through xnttmanager). There are so many possibilities and use cases that it's hard to list them all here!
- 🇫🇷France guignonv
Additional comment: I forgot to emphasize one important thing that makes a big difference between External Entities and other modules like External Data Source (or Tripal for biological data): it is entity-based and not field-based.
Why is it important to me? Because I need to create hybrid entities made from several sources. For instance, I have some parts of my data stored in a database and other parts in files, and I can also aggregate other information from REST services. As an example, I have "germplasm" data (a kind of specific organism/plant genetic profile) in a database, but I aggregate some processing status from a flat TSV file used by another external application, and I also need to aggregate data from a partner site to know whether that germplasm has been used in some experiments. If my data were loaded on a field basis, each data source would be queried separately for each field. So, with 20 database fields, 4 file fields and 10 REST fields, I would have 20 database queries, 4 file system requests and 10 (web) REST queries! That would not be efficient. With the External Entities approach and the xnttmulti module, I can query each source just once and then map the data to my Drupal fields, which I feel is more efficient. That's a key point: the fewer times external sources are queried, the better.
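To make the arithmetic concrete, here is a small Python sketch comparing the number of source requests per entity load under the two approaches, using the field counts from the example above. It only counts requests; the `count_queries` helper and the source names are hypothetical, not part of any module.

```python
# Hypothetical illustration only: counts requests for one entity load.
# Field counts come from the example above; names are made up.
fields_per_source = {"database": 20, "tsv_file": 4, "rest": 10}


def count_queries(fields_per_source, entity_based):
    """Return how many source requests one entity load triggers."""
    if entity_based:
        # One request per source; all fields are then mapped from the result.
        return len(fields_per_source)
    # Field-based: every field triggers its own request to its source.
    return sum(fields_per_source.values())


field_based = count_queries(fields_per_source, entity_based=False)
entity_based = count_queries(fields_per_source, entity_based=True)
print(field_based, entity_based)  # prints "34 3"
```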