Overview
I have noticed a behaviour of the Drupal 8 Entity system that is rather confusing. When an entity (e.g. node1) is saved using Entity::save(), it is evicted from both the static and the persistent cache. Other entities (e.g. node2) that reference it through an entity reference field, and that have already been loaded, keep holding a reference to the entity object that has just been evicted. This leads to an incoherent state: every subsequent load of the referencing entity (node2) returns the cached node2, which still points to the evicted object. The incoherence shows up when we load the referenced entity (node1) again and modify it. From then on, changes made to node1 are not visible through node2 unless we also evict node2 from the cache.
The problem is better illustrated by the following example.
Example
Model:
Type1:
Type2:
- field_ref_type1: Entity reference to type1
Data:
Nodes:
Node(id=1) of type1
Node(id=2) of type2 with field_ref_type1 referencing Node(id=1)
Code
// Load the type1 node (the referenced entity).
$n1 = \Drupal::entityManager()->getStorage('node')->load(1);
// Load the type2 node (the referencing entity).
$n2 = \Drupal::entityManager()->getStorage('node')->load(2);
// Step 1: modify node 1 before saving. Both access paths reach the same object.
$n1->field_integer1 = 100;
print("field_integer1 through n1: {$n1->field_integer1->value}\n");
print("field_integer1 through n2: {$n2->field_ref_type1->entity->field_integer1->value}\n");
print('n1 addr: ' . spl_object_hash($n1)."\n");
print('n2 addr: ' . spl_object_hash($n2->field_ref_type1->entity)."\n");
// Step 2: save node 1 (which evicts it from the cache), reload it and modify it again.
$n1->save();
$n1 = \Drupal::entityManager()->getStorage('node')->load(1);
$n1->field_integer1 = 200;
print("field_integer1 through n1: {$n1->field_integer1->value}\n");
print("field_integer1 through n2: {$n2->field_ref_type1->entity->field_integer1->value}\n");
print('n1 addr: ' . spl_object_hash($n1)."\n");
print('n2 addr: ' . spl_object_hash($n2->field_ref_type1->entity)."\n");
// Step 3: reload node 2. It comes from the cache and still points to the old node 1 object.
$n2 = \Drupal::entityManager()->getStorage('node')->load(2);
print("field_integer1 through n1: {$n1->field_integer1->value}\n");
print("field_integer1 through n2: {$n2->field_ref_type1->entity->field_integer1->value}\n");
print('n1 addr: ' . spl_object_hash($n1)."\n");
print('n2 addr: ' . spl_object_hash($n2->field_ref_type1->entity)."\n");
Output
field_integer1 through n1: 100
field_integer1 through n2: 100
n1 addr: 000000004ca1b1840000000013d89c9c
n2 addr: 000000004ca1b1840000000013d89c9c
field_integer1 through n1: 200
field_integer1 through n2: 100
n1 addr: 000000004ca1bcac0000000013d89c9c
n2 addr: 000000004ca1b1840000000013d89c9c
field_integer1 through n1: 200
field_integer1 through n2: 100
n1 addr: 000000004ca1bcac0000000013d89c9c
n2 addr: 000000004ca1b1840000000013d89c9c
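The pattern in the output above can also be reproduced outside Drupal. What follows is only a toy model in plain PHP, assuming a storage that evicts an entity from its static cache on save and a referencing entity that keeps the resolved reference object once it has been looked up. ToyEntity and ToyStorage are hypothetical names and are not part of the Drupal Entity API; the sketch only illustrates the mechanism described in the Overview.

```php
<?php

/** Stand-in for an entity; $refObject keeps the resolved reference. */
class ToyEntity {
  public $id;
  public $value;
  public $refId;
  public $refObject = NULL;

  public function __construct(int $id, int $value = 0, ?int $refId = NULL) {
    $this->id = $id;
    $this->value = $value;
    $this->refId = $refId;
  }
}

/** Stand-in for an entity storage with a static cache (hypothetical). */
class ToyStorage {
  private $rows = [];   // "database": id => ['value' => ..., 'refId' => ...]
  private $cache = [];  // static cache: id => ToyEntity

  public function insert(int $id, int $value, ?int $refId = NULL): void {
    $this->rows[$id] = ['value' => $value, 'refId' => $refId];
  }

  /** Returns the cached object if there is one, otherwise builds a new one. */
  public function load(int $id): ToyEntity {
    if (!isset($this->cache[$id])) {
      $row = $this->rows[$id];
      $this->cache[$id] = new ToyEntity($id, $row['value'], $row['refId']);
    }
    return $this->cache[$id];
  }

  /** Resolves the reference once and keeps it on the referencing entity. */
  public function referenced(ToyEntity $entity): ToyEntity {
    if ($entity->refObject === NULL) {
      $entity->refObject = $this->load($entity->refId);
    }
    return $entity->refObject;
  }

  /** Writes the row and evicts the entity from the static cache. */
  public function save(ToyEntity $entity): void {
    $this->rows[$entity->id] = ['value' => $entity->value, 'refId' => $entity->refId];
    unset($this->cache[$entity->id]);
  }
}

$storage = new ToyStorage();
$storage->insert(1, 0);      // plays the role of Node(id=1)
$storage->insert(2, 0, 1);   // plays the role of Node(id=2), referencing id 1

$n1 = $storage->load(1);
$n2 = $storage->load(2);
$n1->value = 100;
$before = $storage->referenced($n2)->value;  // resolves to the same object as $n1

$storage->save($n1);         // evicts id 1 from the cache
$n1 = $storage->load(1);     // a new object for id 1
$n1->value = 200;

$n2 = $storage->load(2);     // still the cached object for id 2
$after = $storage->referenced($n2)->value;   // read through the stale reference
$sameObject = ($storage->referenced($n2) === $n1);

printf("through n2 before save: %d\n", $before);
printf("through n2 after save: %d\n", $after);
printf("same object: %s\n", var_export($sameObject, TRUE));
```

In this model the value read through the referencing entity stays at 100 after the save, while the freshly loaded object holds 200, which mirrors the second and third groups of the output above.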
Observations
My issue with the above behaviour is that it is not clear how somebody should load an entity, using the entity cache whenever possible, without running into the coherency issues shown above.
In case the synthetic scenario above seems far-fetched, think of a reasonably complex backend in which each function takes several entities as arguments. Arguments can be passed either as entity ids or as entity objects. In the former case, the example demonstrates that, unless the cache is cleared before every load, it is not possible to write a function that always reads the latest value of an entity field, because loading an entity by its id may return incoherent data depending on the code that has been executed before. In the latter case, some function in the call hierarchy still has to load the objects by their id for the first time (e.g. from an entity id passed in an HTTP request). The latter case therefore "reduces" to the former.
Proposal
To my knowledge this behaviour deviates from common practice in other Object/Relational Mapping frameworks (e.g. Hibernate in Java), where within a specific context (session) it is guaranteed that entities are accessed through some sort of proxy object whose id->address mapping is constant. This way any entity references are guaranteed to remain valid within that context. To my understanding, such a context is not clearly defined in the Drupal 8 Entity API. The underlying issue is most probably the requirement that Entity::save() evicts the entity object from the cache. I am not aware of the design considerations that impose this requirement, but if Entity::save() did not evict the entity from the cache, this incoherence would not arise.
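To make the idea of a constant id->object mapping concrete, here is a minimal sketch in plain PHP, assuming a storage whose save() writes the data but keeps the saved object in its map instead of evicting it. MappedEntity and IdentityMapStorage are hypothetical names; this is neither Hibernate's API nor Drupal's, and the sketch does not address how such a change would interact with the persistent cache or with other parts of the Entity API.

```php
<?php

/** Stand-in for an entity (hypothetical). */
class MappedEntity {
  public $id;
  public $value;

  public function __construct(int $id, int $value) {
    $this->id = $id;
    $this->value = $value;
  }
}

/** Storage that keeps one object per id for its whole lifetime (hypothetical). */
class IdentityMapStorage {
  private $rows = [];  // "database": id => value
  private $map = [];   // identity map: id => MappedEntity

  public function insert(int $id, int $value): void {
    $this->rows[$id] = $value;
  }

  /** Always returns the same object for a given id. */
  public function load(int $id): MappedEntity {
    if (!isset($this->map[$id])) {
      $this->map[$id] = new MappedEntity($id, $this->rows[$id]);
    }
    return $this->map[$id];
  }

  /** Writes the row but deliberately does not evict the object. */
  public function save(MappedEntity $entity): void {
    $this->rows[$entity->id] = $entity->value;
    $this->map[$entity->id] = $entity;
  }
}

$storage = new IdentityMapStorage();
$storage->insert(1, 0);

$n1 = $storage->load(1);
$holder = $n1;            // stands in for the object held by a referencing entity
$n1->value = 100;
$storage->save($n1);

$n1 = $storage->load(1);  // same object as before the save
$n1->value = 200;

$stillSame = ($holder === $n1);
$seen = $holder->value;   // the change is visible through the old reference

printf("same object: %s, value through holder: %d\n", var_export($stillSame, TRUE), $seen);
```

Under this assumption a reference obtained before the save and the object returned by a later load are identical, so a change made after reloading is visible through both.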
Request
Could somebody kindly confirm whether the behaviour demonstrated in the example above is expected? If it is, and the example simply does not follow the Entity API paradigm, could you please indicate how to avoid the issue without compromising code modularity?