[PP-1] Enforce conflict enforcement outside of tests and e2e tests

Issue created by @tedbow
Comment about 2 months ago →
🇺🇸United States tedbow Ithaca, NY, USA
Comment about 2 months ago →
🇺🇸United States tedbow Ithaca, NY, USA
Merge request !1080Resolve #3526907 Client auto-save conflicts → (Closed) created by tedbow
Comment about 2 months ago →
🇺🇸United States tedbow Ithaca, NY, USA
I brought in the changes from 📌 Add rudimentary conflict prevention to the preview end-point Active because I made progress getting the e2e test to pass, so I thought I would the current state
Comment about 2 months ago →
🇺🇸United States tedbow Ithaca, NY, USA
The only 2 e2e tests that are failing now are `entity-form-field-types-test.cy.js` which does pass for me locally and `media-library.cy.js` which did fail locally but I didn't see an 409 request errors in the log. So maybe these are unrelated random test errors? Not sure, retesting
Comment about 1 month ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
Comment about 1 month ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
AFAICT this is also a beta blocker? Otherwise we haven't solved the current auto-save-data-loss scenarios described in 📌 Add rudimentary conflict prevention to the preview end-point Active ?
Comment about 1 month ago →
🇮🇳India nagwani
Comment about 1 month ago →
🇺🇸United States tedbow Ithaca, NY, USA
I am actually surprised that no existing e2e tests failed 🎉

2 new tests failed

rapid-component-addition.cy.js: I am not sure this can really be test reliability give the random test fails we have in e2e and this test would solely rely on timing. I am removing it

auto-save-conflict.cy.js: This was attempt to try to simulate 2 users with 2 different browser sessions having a conflict. I am not sure how to do this in cypress and searched around a little and couldn't get it to work. Someone with more experience might have better luck
Comment about 1 month ago →
🇬🇧United Kingdom jessebaker
I am having a hard time coming up with a robust solution on the client side.

There are two different issues.

The first issue is concurrent editing from two different client instances. That can be either two different users or the same user in two different tabs.

This is solved by the autoSaves hash being passed back and forth and checked and a warning being issued in the instance that a conflict/mismatch arises. ✅

The second issue is proving more difficult to tackle. This issue occurs when a single user in a single client is able to trigger the mismatch scenario by making multiple requests in quick succession.

The issue happens in the following scenario:

The client sends an update to the server with the current autoSaves hash (we’ll call it 1).
While that request is still in flight, the client sends another update to the server. This request also has autoSave hash 1 because the first request has not returned yet with the updated hash.
The server receives the first request and generates a new autoSave hash (2)
The response to the first request comes back to the client and the local autoSave hash is updated to 2.
The server receives the second request (which has autoSave hash 1) and rejects it (because it should be 2!) as 409
The response to the second request is received by the client as 409 and the user is shown an error.

Potential solutions (to problem 2)

“Lock” the UI when a request is in flight so that subsequent requests cannot be sent
Silently discard new requests made when a request is in flight (this is what Claude’s attempt is doing in !MR1080)
Mark in flight requests as "aborted" when a subsequent request is made. When an aborted response returns, discard the returned data knowing that the second response is on the way.
Create a client-side “pending” system. While a request is in flight, subsequent data updates to the layout/model are stored as a local variable. When a response is received the local variable is checked and if the variable is set, generate a new request with the data from the variable combined with the autoSaves hash from the most recent response. Then Immediately unset the variable. If multiple data updates are made before the response comes back, they can each just overwrite the local variable as each change is a cumulative data that includes all previous data updates.
On loading the React app generate a uuid to use as a client instance ID. Alongside the autoSave hash also send the client instance ID to the server. When the server receives consecutive requests with the same client instance ID, bypass the autoSave hash check as we know that they have come from a single instance and thus can’t be conflicting concurrent changes.

I've investigated the above solutions and have the following conclusions:

This has poor UX implications. If we visually indicate the UI is locked it appears flickery on a good connection. If we just silently ignore user actions it makes the app feel unresponsive

This allows the server and client to get out of date. The user makes 3 actions but the 2nd and 3rd are silently discarded. It looks like the actions have been made, but on the server only the 1st actually happened.

I spent a long time implementing this only to find that it doesn't solve the issue. The subsequent request still has the out of data autoSave hash at the point it is dispatched so it still leads to a 409 error.

I have spent even longer attempting to create this system. After 2 days I've not been able to successfully get this working with RTK query. I DO think it's possible, it's just VERY difficult and is resulting in a large amount of additional complexity

I have just had this idea and I'm writing this comment on the issue because I'd like to get a second opinion on if its feasible. I think it would work and is less complex than the pending system above.
Comment about 1 month ago →
🇺🇸United States tedbow Ithaca, NY, USA
I think there is pontential problem with #11 5)

they have come from a single instance and thus can’t be conflicting concurrent changes.

Is this true? What if the client makes 3 request in sequence and for some reason request 1 takes a long time to connect? Basically request 2 and 3 are processed and then request 1 gets processed. Wouldn't that wipe out changes in 2 and 3? Is what I suggesting even possible, maybe on randomly sluggish networks? Maybe ok to handle this edge case?
Comment about 1 month ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
It may be that 📌 Replace the postPreview action with atomic equivalents Active is a better use of effort if the approach here (which was meant to be a short-term quick fix) is not possible.
Comment about 1 month ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
+1 for re-assessing 📌 Replace the postPreview action with atomic equivalents Active . It's now >6 months old, but still relevant.

Marking and leaving assigned to @tedbow to convey we should first check that avenue.
Merge request !1127#3526907 Send timestamp, autoSave hash and unique client id with post and patch... → (Merged) created by jessebaker
Pipeline finished with Failed
about 1 month ago
Total: 1091s
#519668
Comment about 1 month ago →
🇺🇸United States tedbow Ithaca, NY, USA
re #13 and #14 I am currently a -1 on doing 📌 Replace the postPreview action with atomic equivalents Active because

There is nothing in #3492065 for "User interface changes"(I just copied that part of 📌 Add rudimentary conflict prevention to the preview end-point Active to here now)
if we "Enable concurrent editing" there probably will be some very tricky UX problems and we should figure the desired outcomes before we start making the solution
Will concurrent editing just work perfectly and we all agree exactly what that means so there is no describe the UX? 😜

for instance what happens if I am editing props for component, which should show the updates I as type, at the same time another user does another operation that moved that operation at out of view? This would either be by the other user reorder components at the same level as mine or adding another component above it that puts my out of view. Or it could be changing the props of component above mine that changes the height.

Is there no possibility of conflict still in that system? I understand it would be less possible because you are allowing 2 users to edit different parts of the page. But 2 users could still try to edit the same component where 1 user didn't start with the latest changes that were just auto-saved correct?
If so think a lot of the problems documented in #11 still exist just at a different level down

I think #12 out of sync requests from the same user could still be problem. Especially because we send auto-save requests as you type for single prop of single component

So I think you we wouldn't be saving time but not having to solve #11 it would just be a different level, per component

📌 Replace the postPreview action with atomic equivalents Active just seems more complicated so there would be more unknown unknowns. I think we are further along on the current implementation so there would less, less risk
Pipeline finished with Failed
about 1 month ago
Total: 818s
#520552
Comment about 1 month ago →
🇺🇸United States tedbow Ithaca, NY, USA
Just wanted to make clear my comments in #16 were all with the idea that we have limited time before Beta1 - re 🌱 Milestone 1.0.0-beta1: Start creating non-throwaway sites Active and probably have a lot of other issues. The product decision has been made that we don't need concurrent editing in Beta1. Not sure about GA. but I assume we have a lot of other features that were prioritized in for Beta1

If we had more development resources and there weren't other features that were deemed beta blockers I would be all for getting concurrent editing in before beta1. But also if also if we made that decision won't the first order of business be defining the UX for that?
Pipeline finished with Failed
about 1 month ago
Total: 266s
#520790
Comment about 1 month ago →
🇬🇧United Kingdom jessebaker
@tedbow I've found a bug that I think needs to be addressed server side.

Steps

Ensure you have no autosaves/pending changes

Reload the page to ensure you start fresh

Make a change (e.g. edit a component prop)

Make another change. Note that these changes both make requests with the correct/latest autoSave hash in the payload

Publish all changes

Make a third change. Note that this request correctly sends the last autoSave hash that was received in .4

The response to this third change is a 409.

I suspect the latest autoSave hash on the server side is either being deleted or updated when changes are published but not communicated back to the FE so the client ends up with an out of date hash.

I think that either a) publishing should not update the hash, or b) the response from a publish action needs to include the updated hash to the FE can store it for the next request.
Pipeline finished with Failed
about 1 month ago
#521462
Pipeline finished with Failed
about 1 month ago
Total: 973s
#521474
Pipeline finished with Failed
about 1 month ago
Total: 960s
#521521
Pipeline finished with Failed
about 1 month ago
#521738
First commit to issue fork.
Pipeline finished with Success
about 1 month ago
Total: 3570s
#521874
Comment 28 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
@jessebaker I found the problem that is causing #18

It is because during Publish we delete the auto-save entry, including the hashes and the client instance id, for the items that were published. This is because there are no auto-saved information at this point. Everything matches what should have been just saved. We don't want this to show up "Review X changes"

So when the client sends the auto-saved hashes when find a mismatch, client has auto-saved hashes and the server doesn't. Now we could change the server side mechanism to stored this information separate from the actual auto-save information we will need to publish the changes but that would make the server side more complex. Also it is also possible that this particular entity will never be edited within XB again, so this keeps us from having to keep auto-saved hash information for any entity that was ever edited in XB forever.

So I was going to suggest that when submit a publish request and get response from the server that you delete the auto-save hashes. That would solve the problem in #18 but I did make me think of another problem, that actually exists right now

User 1 open xb/xb_page/1/editor, does make any changes, there is no auto-saved hash

User 1 leave their browser open

User 2 opens xb/xb_page/1/editor makes some changes

User 2 publishes xb_page 1

User 1, makes a change in their browser session that they never closed

I think the effect would be that User 1 would make an auto-saved version that would not be aware of the published changes and so would wipe out those changes when it was published again.

We could solve this by not deleting the auto-saved entry the server-side, and therefore throwing a conflict error for User 1. But again this would require use to keep auto-save hash information forever for all entities, which could start to add up on larger sites.

So I don't think this is a solution.

I think what we need to do is have the client be aware of the starting point version of the entity it is editing.

So for example for a page or node this would be version number, vid or changed property.
Basically the server would return to the client, along with the "autosaves" key, a "verison_id" key of the last saved version on the server. The client would always return this key.

So in the example above User 1's last request would properly get a 409 error because it's version number would be 1 version number behind. But this would also require the client to request /xb/api/v0/layout/xb_page/1 after publishing. Though after 🐛 [Meta] Selective Publishing and Reverting Active the client could try to be more selective and not request if the current page was not publish it would probably be safer to always just request it.

There are probably unrelated reasons why the client should always call /xb/api/v0/layout/xb_page/1 after publishing. For example Drupal has hook_entity_presave which allows altering an entity before it is saved. Of course XB will not be aware of any changes made in custom code here, therefore if XB doesn't call /xb/api/v0/layout/xb_page/1 after publishing the auto-save request could wipe out any changes that were made in any hook_entity_presave code
Comment 28 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
I pushed up 86ce2085 which will return the revision id under "auto" . Probably at least autoSaves: Record<string, string>; will need to updated on the front-end because this now an object.

but it seems to work manually testing
Comment 28 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Note I have not had time to fix the phpunit tests but the e2e test passed so that is good sign. Just to need to update the phpunit tests to expect to a nested structure with revision info from the client.
Comment 27 days ago →
🇬🇧United Kingdom jessebaker
@tedbow re: #18 #20

1) I think perhaps we should create/address this in a follow up
2) +1 to using 'changed' or even perhaps 'lastModified' and using a timestamp - I could see that information actually being useful elsewhere in the UI at some point.
3) After publishing I think we already refetch some data so I don't see it being a problem to update the page.
Pipeline finished with Failed
26 days ago
Total: 818s
#524823
Pipeline finished with Failed
26 days ago
#524849
Pipeline finished with Failed
26 days ago
Total: 740s
#524863
Pipeline finished with Canceled
26 days ago
Total: 127s
#524870
Pipeline finished with Failed
26 days ago
Total: 1044s
#524871
Pipeline finished with Failed
26 days ago
Total: 493s
#524903
Pipeline finished with Failed
26 days ago
Total: 928s
#524909
Pipeline finished with Failed
26 days ago
Total: 1110s
#525016
Comment 26 days ago →
🇬🇧United Kingdom jessebaker
There are probably unrelated reasons why the client should always call /xb/api/v0/layout/xb_page/1 after publishing.

#20

As of my latest commit the FE now re-requests the page after publishing changes IF the current page is amongst the list of pending changes.

In my testing there is one remaining bug that I can see that is quite tricky to reproduce.

In a browser tab start from a blank page, no autosaves (everything published)

Copy the URL and open a new tab so you have two tabs with the same blank page state.

In Tab 1, add a Heading component and immediately publish all changes

Switch to Tab 2.

Preview is now out of date (showing blank still, but should have a Heading on it)

Add a Hero component

There should be a conflict warning but there isn't.

From what I can see, the issue is that the autosave hash has autoSaveRevision but it's always "1". Nothing I do ever seems to increment that number (that is actually a string).

{ "autoSaveRevision": "1", "hash": "efa8dd3ce4f5f9e1" }
Comment 26 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
re #24 ok might know what is happening.....
Pipeline finished with Failed
26 days ago
Total: 848s
#525244
Comment 26 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
@jessebaker thanks for testing this. I was testing this with nodes and it was working.

Basically entity have a seperate revision id but if we create a new revision on every save this won't increment. I guess Page is not incrementing which I guess makes sense because we don't have a way to revert to old revisions or view old revisions AFIACT

So I used the "changed" time also in the revision for the client
Pipeline finished with Failed
26 days ago
Total: 997s
#525251
Pipeline finished with Failed
26 days ago
Total: 1060s
#525267
Pipeline finished with Success
26 days ago
Total: 2645s
#525282
Pipeline finished with Failed
26 days ago
Total: 928s
#525377
Comment 26 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
changing title to make clear we are only handling content entities, node and pages with layouts

We will need this too 📌 For config entities add client support conflict detection via the `autoSaves' key Active
Pipeline finished with Failed
26 days ago
Total: 1063s
#525403
Pipeline finished with Failed
26 days ago
Total: 1857s
#525425
Pipeline finished with Failed
26 days ago
Total: 821s
#525450
Pipeline finished with Success
26 days ago
Total: 810s
#525461
Pipeline finished with Failed
26 days ago
Total: 670s
#525533
Pipeline finished with Canceled
26 days ago
Total: 547s
#525561
Comment 26 days ago →
System Message
tedbow → closed merge request !1080
Comment 26 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Pipeline finished with Success
26 days ago
Total: 1761s
#525567
Comment 25 days ago →
🇺🇸United States effulgentsia
Comment 24 days ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
How do I begin reviewing this MR? There's lots of comments here, and the issue summary doesn't talk about the client_id and clientInstanceId this MR is adding.

Can we get an up-to-date issue summary to facilitate a review? And ideally also sprinkle ℹ️ … comments over the MR to make it clearer still 🙏

P.S.: if 📌 Document the reasons for not using Workspaces for saving in 0.2.0 Active had landed, then I'd have asked to update that ADR. In fact … any reason not to do that now, @tedbow? 😇🙏 That'd be even better than an issue summary update!
Comment 24 days ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Pipeline finished with Failed
24 days ago
Total: 618s
#527126
Pipeline finished with Failed
24 days ago
Total: 629s
#527235
Pipeline finished with Failed
24 days ago
Total: 997s
#527250
Pipeline finished with Failed
24 days ago
Total: 849s
#527307
Comment 24 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
re-rolled, added some more code comments and updating the summary but not finished. Leaving assigned to myself as I am working on the summary
Comment 24 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Comment 24 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
updated more.... but still more to update

leaving assigned to myself
Comment 24 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Pipeline finished with Failed
21 days ago
Total: 479s
#529377
Pipeline finished with Failed
21 days ago
Total: 289s
#529385
Pipeline finished with Failed
21 days ago
Total: 718s
#529387
Merge request !1Draft: #3526907 Concurrency control allow PATCH for region without validating other regions/main content → (Closed) created by tedbow
Pipeline finished with Failed
21 days ago
Total: 928s
#529403
Comment 21 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Change the MR so that \Drupal\experience_builder\Controller\ApiLayoutController::patch only validates the region that contains the component being updated.

I need to add test to demonstrate how this works
Pipeline finished with Failed
21 days ago
Total: 777s
#529435
Comment 21 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Added test coverage for regions to \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPostTest::testOutdatedAutoSave. \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPatchTest::test already had a test for a conflict with a region
Comment 20 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
@wim leers Re: #31 I have update the summary. I have had a chance to update 📌 Document the reasons for not using Workspaces for saving in 0.2.0 Active but I think this issue can be reviewed. Most of the back-end changes are in \Drupal\experience_builder\Controller\ApiLayoutController::validateAutoSaves. \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPostTest::testOutdatedAutoSave is the most explicit test of this. \Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPatchTest::test() also catches ConflictHttpException which is thrown from validateAutoSaves()
Pipeline finished with Failed
20 days ago
Total: 739s
#529753
Comment 20 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Created follow-up 📌 [PP-1] For less conflict errors only validate region auto-save on layout PATCH Postponed , in that issue I mention we might just want to wait until 📌 Replace the postPreview action with atomic equivalents Active . The back-end work for #3532056 would be simple but I am not sure about the front-end changes that would be needed. We would need to send different info from the client but we might need to reload differently.

Also added a todo in the code
Pipeline finished with Failed
20 days ago
Total: 1215s
#530251
Comment 20 days ago →
🇮🇳India nagwani
Pipeline finished with Failed
20 days ago
Total: 1399s
#530279
Pipeline finished with Failed
19 days ago
Total: 272s
#530514
Pipeline finished with Failed
19 days ago
Total: 855s
#530518
Comment 19 days ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
Left a review (which I assume is why this was assigned to me)
Unassigning
Comment 18 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
I addressed a couple points in @larowlan's review
Comment 17 days ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
Replied on the issue
There's a few open threads but only one question from me about using ::toArray vs ::normalizeEntity
Comment 13 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Assigning back to @larowlan because I answered his question on the MR
Comment 13 days ago →
🇦🇺Australia larowlan 🇦🇺🏝.au GMT+10
@tedbow's response regarding using the hash makes sense - thanks
Pipeline finished with Failed
13 days ago
Total: 839s
#536360
Pipeline finished with Success
13 days ago
Total: 763s
#536369
Pipeline finished with Success
12 days ago
Total: 929s
#536531
Comment 12 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
adding blocker tag as this blocks 📌 For config entities add client support conflict detection via the `autoSaves' key Active and 🐛 Add rudimentary conflict prevention to the Config Auto-save end-point Active
Comment 12 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
Comment 12 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
I going to RTBC this based on @bnjmnm MR approval and @larowlan saying the backend changes look good
Pipeline finished with Running
12 days ago
#537140
Comment 12 days ago →
🇺🇸United States bnjmnm Ann Arbor, MI
wim leers → credited bnjmnm → .
Comment 12 days ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
I trust Lee & Ben, so going ahead and merging! 🚢
Comment 12 days ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Approved by proxy: since @larowlan +1'd the BE.

Unfortunately:

Rebase failed: Rebase locally, resolve all conflicts, then push the branch. Try again.
@tedbow: can you rebase locally & merge once it's green again? 🙏
Pipeline finished with Skipped
12 days ago
#537367
Comment 12 days ago →
System Message

tedbow → committed 9f99e82f on 0.x authored by jessebaker →
Issue #3526907 by tedbow, jessebaker, larowlan, bnjmnm: For content...
Comment 12 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
🎉
Pipeline finished with Success
12 days ago
Total: 2993s
#537356
Comment 4 days ago →
neha_bawankar
Tested changes on branch 0.x , for following scenarios :
Comment 4 days ago →
🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺
Thanks! 🥳
Comment 3 days ago →
🇺🇸United States tedbow Ithaca, NY, USA
re #55 @neha_bawankar wow thanks for the detailed testing report!!!

[PP-1] Enforce conflict enforcement outside of tests and e2e tests

Overview

Proposed resolution

User interface changes

Merge Requests

!1127[PP-1] Enforce conflict enforcement outside of tests and e2e tests
Merged

!1[PP-1] Enforce conflict enforcement outside of tests and e2e tests
Closed

!1080[PP-1] Enforce conflict enforcement outside of tests and e2e tests
Closed

Comments & Activities

Steps

[PP-1] Enforce conflict enforcement outside of tests and e2e tests

Overview

Proposed resolution

User interface changes

Merge Requests

!1127[PP-1] Enforce conflict enforcement outside of tests and e2e testsMerged

!1[PP-1] Enforce conflict enforcement outside of tests and e2e testsClosed

!1080[PP-1] Enforce conflict enforcement outside of tests and e2e testsClosed

Comments & Activities

Steps

!1127[PP-1] Enforce conflict enforcement outside of tests and e2e tests
Merged

!1[PP-1] Enforce conflict enforcement outside of tests and e2e tests
Closed

!1080[PP-1] Enforce conflict enforcement outside of tests and e2e tests
Closed