Refactor module architecture in a simpler, opinionated and more performant approach

Comments & Activities

Comment almost 2 years ago →
🇺🇸United States partdigital
One suggested approach that we've been using on our project to handle entity usage:

We created a service that accepts a top entity along with a specification. It would then traverse through the tree and only store the results that we needed based on that specification. As we traversed the tree we would also store the location of each item so that once the usage was captured we could easily traverse that set with methods like getParent(), getChild(), getSibling() etc.

For example, our API looks like this:

We define a specification. It's basically just an array, but it could be made into a plugin or config entity and given a name, so that you could define meaningful traversal specifications for your project. You could also generate a "default" specification by inspecting which fields and entity types exist on the site, though I've usually found it more useful to be explicit.

$spec = [
  'page' => [
    'entity_type' => 'node',
    'bundle' => 'page',
    'fields' => [
      'field_entity_reference' => [],
    ],
  ],
  'article' => [
    'entity_type' => 'node',
    'bundle' => 'article',
    'fields' => [
      'field_entity_reference' => [],
    ],
  ],
];
We then pass that specification into a method.

$collection = $service->getReferencedEntitiesCollection($parentEntity, $spec);
Now we can do things like this:

// Get first level of children.
$collection->getChildren();
// Get all children recursively.
$collection->getAllChildren();
// Get the immediate parent if the child is known.
$collection->getParent($entity);
// Get all the parents of a known child.
$collection->getAllParents($entity);
// Get all siblings.
$collection->getSiblings($parent, $entity);
This is very fast because we store the entity id and its location in the set (basically an index). See the example below. The key is the location and the value is the entity id.

[
  '0' => 6,
  '0:0' => 4,
  '0:1' => 3,
  '1' => 5,
  '1:0' => 3,
  '1:1' => 2,
  '1:1:0' => 1,
  '2' => 8,
  '3' => 9,
  '4' => 10,
];
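To make the index idea concrete, here is a minimal sketch of how such a location index could be built and queried. It is not our actual service: the function names (buildLocationIndex, parentLocation, getParentIds, getChildIds) and the nested-array input shape are assumptions made up for illustration, and it needs PHP 8 for str_starts_with() and str_contains().

```php
<?php

/**
 * Hypothetical: flattens a nested tree into a location => entity ID index.
 *
 * Each node is ['id' => int, 'children' => [nodes]]. This input shape is an
 * assumption for the example, not what the real service receives.
 */
function buildLocationIndex(array $nodes, string $prefix = ''): array {
  $index = [];
  foreach (array_values($nodes) as $position => $node) {
    $location = $prefix === '' ? (string) $position : $prefix . ':' . $position;
    $index[$location] = $node['id'];
    // Descendant locations always extend the current one, so keys never clash.
    $index += buildLocationIndex($node['children'] ?? [], $location);
  }
  return $index;
}

/**
 * Returns the parent location ('1:1:0' gives '1:1'), or NULL at the top level.
 */
function parentLocation(string $location): ?string {
  $pos = strrpos($location, ':');
  return $pos === FALSE ? NULL : substr($location, 0, $pos);
}

/**
 * Returns the immediate parent ID for every location where the entity occurs.
 */
function getParentIds(array $index, int $entityId): array {
  $parents = [];
  foreach ($index as $location => $id) {
    // PHP casts keys like '0' to integers, so cast back before string work.
    $parent = $id === $entityId ? parentLocation((string) $location) : NULL;
    if ($parent !== NULL) {
      $parents[] = $index[$parent];
    }
  }
  return $parents;
}

/**
 * Returns the IDs stored exactly one level below the given location.
 */
function getChildIds(array $index, string $location): array {
  $children = [];
  $prefix = $location . ':';
  foreach ($index as $key => $id) {
    $key = (string) $key;
    if (str_starts_with($key, $prefix) && !str_contains(substr($key, strlen($prefix)), ':')) {
      $children[] = $id;
    }
  }
  return $children;
}

// The same tree as the index shown above.
$tree = [
  ['id' => 6, 'children' => [['id' => 4], ['id' => 3]]],
  ['id' => 5, 'children' => [['id' => 3], ['id' => 2, 'children' => [['id' => 1]]]]],
  ['id' => 8],
  ['id' => 9],
  ['id' => 10],
];
$index = buildLocationIndex($tree);
```

Every lookup here is plain string handling on the keys, so nothing has to be loaded to answer "who is the parent of X". A real implementation would probably also keep a reverse map from entity ID to locations to avoid scanning the whole index.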
To get this working with the broader entity usage, you could:

Cache/store each set for each top entity.

Create a relationship between the top entity and each child entity so that it's easy to find the set. So in the example above you'd have 10 records.

The API might look like this:

// Finds the top entity and all its sets with this child. Let's say it's just one collection.
$collection = $service->findCollection($childEntity);
// This then gets its immediate parent (not the top parent).
$collection->getParent($childEntity);
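As a rough sketch of the storage side, the two steps above (store each set per top entity, and relate each child back to its top entity) could look like the in-memory class below. The class name CollectionRegistry and its methods are hypothetical, and it keeps one relationship per distinct child ID rather than one per index entry, which is an assumption on my part. A real version would write to a table or cache bin instead of arrays.

```php
<?php

/**
 * Hypothetical in-memory registry of location indexes, keyed by top entity.
 */
final class CollectionRegistry {

  /** @var array Top entity ID => location index (location => entity ID). */
  private array $indexes = [];

  /** @var array Child entity ID => set of top entity IDs containing it. */
  private array $topsByChild = [];

  /**
   * Stores (or replaces) the location index captured for a top entity.
   */
  public function store(int $topId, array $index): void {
    // Drop relationships left over from a previously stored index.
    foreach ($this->indexes[$topId] ?? [] as $oldChildId) {
      unset($this->topsByChild[$oldChildId][$topId]);
    }
    $this->indexes[$topId] = $index;
    foreach ($index as $childId) {
      $this->topsByChild[$childId][$topId] = TRUE;
    }
  }

  /**
   * Returns every stored index that contains the child, keyed by top ID.
   */
  public function findIndexes(int $childId): array {
    $found = [];
    foreach (array_keys($this->topsByChild[$childId] ?? []) as $topId) {
      $found[$topId] = $this->indexes[$topId];
    }
    return $found;
  }

}

$registry = new CollectionRegistry();
$registry->store(100, ['0' => 6, '0:0' => 4]);
$registry->store(200, ['0' => 4]);
// Entity 4 is used under both top entities.
$topIds = array_keys($registry->findIndexes(4));
```

Finding "where is this child used" is then one lookup in the relationship map followed by key arithmetic on the matching index, with no tree traversal at query time.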
Just food for thought as you're working on this :)
Comment over 1 year ago →
🇦🇺Australia acbramley
Is the plan to still go ahead with this 3.x branch? I see there's now a 4.x branch using entity_track. Surely we should consolidate efforts on a single new architecture?

We use this module pretty heavily on one of our client projects, and they've recently asked for features such as filtering the Usage list by current/previous revision, so I'm happy to help the efforts in order to unlock some of those more complex features.
Comment over 1 year ago →
🇪🇸Spain marcoscano Barcelona, Spain
Thanks all who have been providing feedback and ideas to this issue. Apologies for not replying earlier 🙏

@acbramley thanks! I will take any help available :)

Currently I would say that both the 3.x and 4.x branches are very much experimental and shouldn't be used on prod. Development on 3.x stalled at some point because I didn't feel good being the only one moving this idea forward (this being such a disruptive architectural change). Then at some point @seanb and @askibinski came up with the idea of splitting the API into a generic layer to "track things", and then making Entity Usage just a consumer of that API. That makes sense to me, but we didn't fully make the switch to this new 4.x branch, and the development kind of stalled.

Yes, I think at this point it makes sense to envision the refactoring mentioned here on top of the 4.x branch. In order for that to happen, I would say that a rough roadmap could be:

ET = Entity Track
EU = Entity Usage

0- [NEEDS WORK] Fix tests in D10 / Switch to GitLab CI (see 📌 "Fix tests in HEAD for D10", status: Active)
1- [ALMOST DONE ?] review the current code / update the branches with latest commits on EU (entity_usage) 2.x and ensure we have feature parity between ET 1.x + EU 4.x and EU 2.x
- This was kind of OK as of Dec 2022 with #3324787: Update 4.x branch and #3324797: Update with entity_usage 2.x changes, but we'd need to review the latest bug fixes since then.
2- [NEEDS REVIEW] ensure that the test coverage of ET 1.x and EU 4.x combined is equivalent to what we have in EU 2.x
- This probably happened as part of the above issues as well, but we'd need to double-check we are not losing test coverage in the switch
3- [NEEDS WORK] ensure we have an upgrade path for existing users on EU 2.x (#3326110: Create an upgrade path for EU 2.x -> ET 1.x + EU 4.x)
4- [DONE ?] have some real world experience / feedback of ET 1.x + EU 4.x
- I know of one reasonably-sized project that is using ET+EU on prod for a couple years now, but it would be great to get more alpha testers out there if we can.

After this, I believe we could tag an EU 4.0.0-beta1 and mark it as the recommended branch instead of 2.x.

Then, it would likely make sense to revisit the refactor from this issue and simplify everything in a 5.x branch probably?

I am OK going forward with this plan and welcome everyone that is able/willing to participate.

Thanks!
Comment 3 months ago →
🇷🇴Romania claudiu.cristea Arad 🇷🇴
As there's no movement here, and because I badly needed something like what this refactoring envisions, I had to create a new module which is more or less based on this idea. Enter Track Usages.

Here are some key differences:

Track Usages only registers relations between source (top-level) entities and target entities. Traversable (middle-level) entities are traversed but not registered as distinct records/rows. This lets the database scale; with Entity Usage, the usage table very often becomes unmaintainable.

The configuration is stored in config entities, meaning you can record usages for different scenarios. For instance, you may want to track the relation between nodes and their files, but for a different scope, you may want to track the relation between nodes and taxonomy terms. For each scope, you will define a different configuration.

This module doesn't offer any UI to end users; its only business is to provide the usages.
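To illustrate the first point (traverse middle-level entities, record only source-to-target relations), here is a minimal sketch. It is not code from the module: the function collectTargets(), the 'type:id' string keys and the reference map are all hypothetical, and stopping the walk at a target entity is my own assumption.

```php
<?php

/**
 * Hypothetical: collects the targets reachable from a source entity.
 *
 * @param string $source
 *   Key of the source (top-level) entity, e.g. 'node:1'.
 * @param array $references
 *   Map of entity key => list of entity keys it references.
 * @param callable $isTarget
 *   Receives an entity key; returns TRUE if it should be recorded as a usage.
 *
 * @return string[]
 *   Target keys in discovery order; traversable entities are not included.
 */
function collectTargets(string $source, array $references, callable $isTarget): array {
  $targets = [];
  $visited = [$source => TRUE];
  $queue = $references[$source] ?? [];
  while ($queue) {
    $key = array_shift($queue);
    if (isset($visited[$key])) {
      continue;
    }
    $visited[$key] = TRUE;
    if ($isTarget($key)) {
      // One source => target row; assumed here to end this branch of the walk.
      $targets[] = $key;
      continue;
    }
    // Traversable entity: follow its references but record nothing for it.
    foreach ($references[$key] ?? [] as $next) {
      $queue[] = $next;
    }
  }
  return $targets;
}

// A node referencing two paragraphs, which lead to files directly or via media.
$references = [
  'node:1' => ['paragraph:10', 'paragraph:11'],
  'paragraph:10' => ['file:30'],
  'paragraph:11' => ['media:20', 'file:30'],
  'media:20' => ['file:31'],
];
$files = collectTargets('node:1', $references, fn (string $key): bool => str_starts_with($key, 'file:'));
```

For this input only two rows would be stored for node:1 (file:30 and file:31), however many paragraphs or media entities sit in between.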

Posted this comment for those who might be interested.

PS: I needed this kind of functionality to achieve the scope of the File Visibility module.
