- Issue created by @john.oltman
- Merge request !11887Issue #3520065: The migrate Row class API is incomplete β (Closed) created by john.oltman
- πΊπΈUnited States john.oltman
john.oltman β changed the visibility of the branch 3520065-migrate-row-remove-empty to hidden.
- Merge request !11888Issue #3520065: The migrate Row class API is incomplete β (Open) created by john.oltman
- πΊπΈUnited States smustgrave
Small comments. Tagging for sub-maintainer sign off for API changes.
- πΊπΈUnited States benjifisher Boston area
Thanks for suggesting this addition to the Migrate API. I think this is more of a feature request than a bug report, so I am updating the category.
I am trying to understand the use case for this addition. Is it mostly for migrations that update existing entities? If a migration is creating a new entity, then I think it does not matter whether the
Row
object has a null property or is just missing that property.I also wonder whether we need all four of the new methods. Are
hasDestinationOrEmptyProperty()
andremoveDestinationOrEmptyProperty()
really useful? Again, what is the use case?For both of those questions, it would help to see a code snippet of how you are using (or plan to use) the new methods.
- πΊπΈUnited States john.oltman
Yes, typically an update migration. I have a client with a business rule that states that under certain circumstances, data in an important field should not be altered or removed by a migration import, even if it otherwise would be. This is impossible without patching the Row class. Simply put, the class has two protected variables that work in tandem, destination properties and empty properties. The destination has set, get and remove. The empty only has set and get. It needs remove, just like its sibling. Call it a feature request if you will, but it seems to be an oversight.
Regarding `hasDestinationOrEmptyProperty` and `removeDestinationOrEmptyProperty`, those are convenience methods - strictly speaking they are not needed, but are very useful when trying to implement the "prevent alteration of field in any way" use case. We might as well add them while we are in there.
- πΊπΈUnited States john.oltman
Code snippet in an event subscriber:
public function onPreRowSave(MigratePreRowSaveEvent $event): void { $row = $event->getRow(); if ($field_name = $this->complexBusinessLogicThatDeterminesWhichFieldNotToAlter($row)) { if ($row->hasDestinationOrEmptyProperty($field_name)) { $row->removeDestinationOrEmptyProperty($field_name); } } } public static function getSubscribedEvents(): array { return [ MigrateEvents::PRE_ROW_SAVE => ['onPreRowSave'], ]; }
- πΊπΈUnited States benjifisher Boston area
Thanks for the feedback.
I think we should not implement the two convenience methods. There are several reasons:
The
has...()
convenience method callshasDestinationProperty()
and (if that returnsFALSE
)hasEmptyDestinationProperty()
. If the code next calls theremove...
convenience method, as in your snippet, it calls the samehas...
method(s). That seems sloppy.The method names are long, and they do not adequately describe what the methods do. Anyone reading the code will have to look up what the method does. It is convenient to write, but not to read.
Adding extra methods means we have to add and maintain test coverage for them.
In simple cases with content entities, you can get the same effect by adding
overwrite_properties
to the configuration of the destination plugin. (SeeDrupal\migrate\Plugin\migrate\destination\EntityContentBase
.) I think this covers the most common use cases. I do not want to add convenience methods to the API if they are only going to be used by occasional projects that have especially complex business logic. Those projects can add convenience functions (not methods on theRow
class) themselves. - πΊπΈUnited States benjifisher Boston area
On second thought, I do not think we need to add
hasEmptyDestinationProperty()
.First reason: it is safe to call
removeEmptyDestinationProperty()
without first testinghasEmptyDestinationProperty()
.Second reason: if code does need to test, it can just use
in_array($field_name, $row->getEmptyDestinationProperties())
.Source and destination properties need the
has...
methods because they use theNestedArray
methods, but$emptyDestinationProperties
is just a flat list of strings. - πΊπΈUnited States john.oltman
Thanks for your review, this is ready for another look.
I think we should not implement the two convenience methods.
Good points, I removed those two methods.
This is one more line, but it is fewer characters.
That is a nice optimization, I applied your suggestion to the MR.
On second thought, I do not think we need to add hasEmptyDestinationProperty().
I kept this method in for these reasons:
* It causes no harm
* It provides symmetry with the destination "has" method, and thus contributes to the ability to understand the Row class
* It provides information hiding of the internal implementation, which is an accepted principal of good API designI can take it out if required, but would prefer to keep it if you are amenable on this point.
- πΊπΈUnited States benjifisher Boston area
Thanks for the updates.
I think we can compromise and keep the
has...
method. After all, it is handy to have it in the tests.In addition to the unit test, I would like to add a functional test. Basically, it comes down to the question I asked in Comment #12: I would like an example in a test (as well as an answer here on the issue) to "What is the use case?"
I think the functional test needs a test module with two migrations. Each migration can use the
embedded_data
source plugin. The first migration can populate two fields, and the second migration can set both fields to empty. (To be honest, I am not sure what it takes to make that happen. Maybe justfoo: null
in the embedded data andfield_foo: foo
in theprocess
section?) Then the module should implement an event listener, like the snippet in Comment #15 and remove just one of the two empty destinations.Then the test can run the first migration to create an entity, then run the second migration to update the entity. After all that, assert that one of the two fields is empty and the other is not.
If I got any of that wrong, then I mis-understood the answer in Comments #13, #15. In that case, it proves that we need the functional test.
- πΊπΈUnited States john.oltman
Thank you, I'd be happy to create the functional test you describe. And your understanding is perfect. I'll hop on this and get something posted within the next week.
- πΊπΈUnited States john.oltman
This worked out nicely. I added the test as a Kernel test since a browser is not needed and kernel use fewer GitLab resources and run much faster than functional. I named the test class generically as RowTest in case we or future maintainers want to add more tests for other methods of the class. For now the existing test validates the destination methods with a focus on the new methods. Back to ready for review.
- πΊπΈUnited States benjifisher Boston area
Thanks for adding the test, and +1 for making it a Kernel test. I have just a few suggestions on the MR.
- πΊπΈUnited States john.oltman
Good suggestions, I applied all of them. Feels lean and mean, see what you think.
- πΊπΈUnited States benjifisher Boston area
Looks great! Thanks for putting up with all my requests for tweaks.
I sometimes have to remind myself that the T in RTBC stands for "tested".
For testing purposes, I installed Drupal with the Umami demo profile (and sample content). I then hacked the
log
process plugin, hard-coding a field name:public function transform($value, MigrateExecutableInterface $migrate_executable, Row $row, $destination_property) { // Hack for testing purposes. if (empty($row->getSourceProperty('field_preparation_time'))) { $row->removeEmptyDestinationProperty('field_preparation_time'); } // ... }
I created a test module with the following migration:
id: empty_destination_test migration_tags: - Test label: 'Test removeEmptyDestinationProperty()' source: plugin: embedded_data ids: nid: type: integer data_rows: - nid: 9 field_preparation_time: null field_number_of_servings: null process: field_preparation_time: field_preparation_time field_number_of_servings: field_number_of_servings nid: - plugin: log source: nid destination: plugin: entity:node default_bundle: recipe migration_dependencies: {}
It calls
log
after the two numeric fields have been processed: the order matters!As expected, when I executed the migration (
drush mim empty_destination_test
),field_number_of_servings
was emptied out andfield_preparation_time
was untouched:MariaDB [db]> SELECT entity_id, field_preparation_time_value -> FROM node__field_preparation_time -> WHERE entity_id = 9; +-----------+------------------------------+ | entity_id | field_preparation_time_value | +-----------+------------------------------+ | 9 | 10 | +-----------+------------------------------+ 1 row in set (0.001 sec) MariaDB [db]> SELECT entity_id, field_number_of_servings_value -> FROM node__field_number_of_servings -> WHERE entity_id = 9; Empty set (0.001 sec)