- Issue created by @tedbow
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
I brought in the changes from ๐ Add rudimentary conflict prevention to the preview end-point Active because I made progress getting the e2e test to pass, so I thought I would the current state
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
The only 2 e2e tests that are failing now are `entity-form-field-types-test.cy.js` which does pass for me locally and `media-library.cy.js` which did fail locally but I didn't see an 409 request errors in the log. So maybe these are unrelated random test errors? Not sure, retesting
- ๐ง๐ชBelgium wim leers Ghent ๐ง๐ช๐ช๐บ
AFAICT this is also a beta blocker? Otherwise we haven't solved the current auto-save-data-loss scenarios described in ๐ Add rudimentary conflict prevention to the preview end-point Active ?
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
I am actually surprised that no existing e2e tests failed ๐
2 new tests failed
rapid-component-addition.cy.js
: I am not sure this can really be test reliability give the random test fails we have in e2e and this test would solely rely on timing. I am removing itauto-save-conflict.cy.js
: This was attempt to try to simulate 2 users with 2 different browser sessions having a conflict. I am not sure how to do this in cypress and searched around a little and couldn't get it to work. Someone with more experience might have better luck
- ๐ฌ๐งUnited Kingdom jessebaker
I am having a hard time coming up with a robust solution on the client side.
There are two different issues.
The first issue is concurrent editing from two different client instances. That can be either two different users or the same user in two different tabs.
This is solved by the autoSaves hash being passed back and forth and checked and a warning being issued in the instance that a conflict/mismatch arises. โ
The second issue is proving more difficult to tackle. This issue occurs when a single user in a single client is able to trigger the mismatch scenario by making multiple requests in quick succession.
The issue happens in the following scenario:
- The client sends an update to the server with the current autoSaves hash (weโll call it 1).
- While that request is still in flight, the client sends another update to the server. This request also has autoSave hash 1 because the first request has not returned yet with the updated hash.
- The server receives the first request and generates a new autoSave hash (2)
- The response to the first request comes back to the client and the local autoSave hash is updated to 2.
- The server receives the second request (which has autoSave hash 1) and rejects it (because it should be 2!) as 409
- The response to the second request is received by the client as 409 and the user is shown an error.
Potential solutions (to problem 2)
- โLockโ the UI when a request is in flight so that subsequent requests cannot be sent
- Silently discard new requests made when a request is in flight (this is what Claudeโs attempt is doing in !MR1080)
- Mark in flight requests as "aborted" when a subsequent request is made. When an aborted response returns, discard the returned data knowing that the second response is on the way.
- Create a client-side โpendingโ system. While a request is in flight, subsequent data updates to the layout/model are stored as a local variable. When a response is received the local variable is checked and if the variable is set, generate a new request with the data from the variable combined with the autoSaves hash from the most recent response. Then Immediately unset the variable. If multiple data updates are made before the response comes back, they can each just overwrite the local variable as each change is a cumulative data that includes all previous data updates.
- On loading the React app generate a uuid to use as a client instance ID. Alongside the autoSave hash also send the client instance ID to the server. When the server receives consecutive requests with the same client instance ID, bypass the autoSave hash check as we know that they have come from a single instance and thus canโt be conflicting concurrent changes.
I've investigated the above solutions and have the following conclusions:
- This has poor UX implications. If we visually indicate the UI is locked it appears flickery on a good connection. If we just silently ignore user actions it makes the app feel unresponsive
- This allows the server and client to get out of date. The user makes 3 actions but the 2nd and 3rd are silently discarded. It looks like the actions have been made, but on the server only the 1st actually happened.
- I spent a long time implementing this only to find that it doesn't solve the issue. The subsequent request still has the out of data autoSave hash at the point it is dispatched so it still leads to a 409 error.
- I have spent even longer attempting to create this system. After 2 days I've not been able to successfully get this working with RTK query. I DO think it's possible, it's just VERY difficult and is resulting in a large amount of additional complexity
- I have just had this idea and I'm writing this comment on the issue because I'd like to get a second opinion on if its feasible. I think it would work and is less complex than the pending system above.
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
I think there is pontential problem with #11 5)
they have come from a single instance and thus canโt be conflicting concurrent changes.
Is this true? What if the client makes 3 request in sequence and for some reason request 1 takes a long time to connect? Basically request 2 and 3 are processed and then request 1 gets processed. Wouldn't that wipe out changes in 2 and 3? Is what I suggesting even possible, maybe on randomly sluggish networks? Maybe ok to handle this edge case?
- ๐ฆ๐บAustralia larowlan ๐ฆ๐บ๐.au GMT+10
It may be that ๐ Replace the postPreview action with atomic equivalents Active is a better use of effort if the approach here (which was meant to be a short-term quick fix) is not possible.
- ๐ง๐ชBelgium wim leers Ghent ๐ง๐ช๐ช๐บ
+1 for re-assessing ๐ Replace the postPreview action with atomic equivalents Active . It's now >6 months old, but still relevant.
Marking and leaving assigned to @tedbow to convey we should first check that avenue.
- Merge request !1127#3526907 Send timestamp, autoSave hash and unique client id with post and patch... โ (Open) created by jessebaker
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
re #13 and #14 I am currently a -1 on doing ๐ Replace the postPreview action with atomic equivalents Active because
- There is nothing in #3492065 for "User interface changes"(I just copied that part of
๐
Add rudimentary conflict prevention to the preview end-point
Active
to here now)
if we "Enable concurrent editing" there probably will be some very tricky UX problems and we should figure the desired outcomes before we start making the solutionWill concurrent editing just work perfectly and we all agree exactly what that means so there is no describe the UX? ๐
for instance what happens if I am editing props for component, which should show the updates I as type, at the same time another user does another operation that moved that operation at out of view? This would either be by the other user reorder components at the same level as mine or adding another component above it that puts my out of view. Or it could be changing the props of component above mine that changes the height.
- Is there no possibility of conflict still in that system? I understand it would be less possible because you are allowing 2 users to edit different parts of the page. But 2 users could still try to edit the same component where 1 user didn't start with the latest changes that were just auto-saved correct?
If so think a lot of the problems documented in #11 still exist just at a different level down
I think #12 out of sync requests from the same user could still be problem. Especially because we send auto-save requests as you type for single prop of single component
So I think you we wouldn't be saving time but not having to solve #11 it would just be a different level, per component
- ๐ Replace the postPreview action with atomic equivalents Active just seems more complicated so there would be more unknown unknowns. I think we are further along on the current implementation so there would less, less risk
- There is nothing in #3492065 for "User interface changes"(I just copied that part of
๐
Add rudimentary conflict prevention to the preview end-point
Active
to here now)
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
Just wanted to make clear my comments in #16 were all with the idea that we have limited time before Beta1 - re ๐ฑ Milestone 1.0.0-beta1: Start creating non-throwaway sites Active and probably have a lot of other issues. The product decision has been made that we don't need concurrent editing in Beta1. Not sure about GA. but I assume we have a lot of other features that were prioritized in for Beta1
If we had more development resources and there weren't other features that were deemed beta blockers I would be all for getting concurrent editing in before beta1. But also if also if we made that decision won't the first order of business be defining the UX for that?
- ๐ฌ๐งUnited Kingdom jessebaker
@tedbow I've found a bug that I think needs to be addressed server side.
Steps
- Ensure you have no autosaves/pending changes
- Reload the page to ensure you start fresh
- Make a change (e.g. edit a component prop)
- Make another change. Note that these changes both make requests with the correct/latest autoSave hash in the payload
- Publish all changes
- Make a third change. Note that this request correctly sends the last autoSave hash that was received in .4
- The response to this third change is a 409.
I suspect the latest autoSave hash on the server side is either being deleted or updated when changes are published but not communicated back to the FE so the client ends up with an out of date hash.
I think that either a) publishing should not update the hash, or b) the response from a publish action needs to include the updated hash to the FE can store it for the next request.
- First commit to issue fork.
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
@jessebaker I found the problem that is causing #18
It is because during Publish we delete the auto-save entry, including the hashes and the client instance id, for the items that were published. This is because there are no auto-saved information at this point. Everything matches what should have been just saved. We don't want this to show up "Review X changes"
So when the client sends the auto-saved hashes when find a mismatch, client has auto-saved hashes and the server doesn't. Now we could change the server side mechanism to stored this information separate from the actual auto-save information we will need to publish the changes but that would make the server side more complex. Also it is also possible that this particular entity will never be edited within XB again, so this keeps us from having to keep auto-saved hash information for any entity that was ever edited in XB forever.
So I was going to suggest that when submit a publish request and get response from the server that you delete the auto-save hashes. That would solve the problem in #18 but I did make me think of another problem, that actually exists right now
- User 1 open xb/xb_page/1/editor, does make any changes, there is no auto-saved hash
- User 1 leave their browser open
- User 2 opens xb/xb_page/1/editor makes some changes
- User 2 publishes xb_page 1
- User 1, makes a change in their browser session that they never closed
I think the effect would be that User 1 would make an auto-saved version that would not be aware of the published changes and so would wipe out those changes when it was published again.
We could solve this by not deleting the auto-saved entry the server-side, and therefore throwing a conflict error for User 1. But again this would require use to keep auto-save hash information forever for all entities, which could start to add up on larger sites.
So I don't think this is a solution.
I think what we need to do is have the client be aware of the starting point version of the entity it is editing.
So for example for a page or node this would be version number,
vid
orchanged
property.
Basically the server would return to the client, along with the "autosaves" key, a "verison_id" key of the last saved version on the server. The client would always return this key.So in the example above User 1's last request would properly get a 409 error because it's version number would be 1 version number behind. But this would also require the client to request
/xb/api/v0/layout/xb_page/1
after publishing. Though after ๐ [Meta] Selective Publishing and Reverting Active the client could try to be more selective and not request if the current page was not publish it would probably be safer to always just request it.There are probably unrelated reasons why the client should always call
/xb/api/v0/layout/xb_page/1
after publishing. For example Drupal hashook_entity_presave
which allows altering an entity before it is saved. Of course XB will not be aware of any changes made in custom code here, therefore if XB doesn't call/xb/api/v0/layout/xb_page/1
after publishing the auto-save request could wipe out any changes that were made in anyhook_entity_presave
code - ๐บ๐ธUnited States tedbow Ithaca, NY, USA
Note I have not had time to fix the phpunit tests but the e2e test passed so that is good sign. Just to need to update the phpunit tests to expect to a nested structure with revision info from the client.
- ๐ฌ๐งUnited Kingdom jessebaker
1) I think perhaps we should create/address this in a follow up
2) +1 to using 'changed' or even perhaps 'lastModified' and using a timestamp - I could see that information actually being useful elsewhere in the UI at some point.
3) After publishing I think we already refetch some data so I don't see it being a problem to update the page. - ๐ฌ๐งUnited Kingdom jessebaker
There are probably unrelated reasons why the client should always call /xb/api/v0/layout/xb_page/1 after publishing.
As of my latest commit the FE now re-requests the page after publishing changes IF the current page is amongst the list of pending changes.
In my testing there is one remaining bug that I can see that is quite tricky to reproduce.
- In a browser tab start from a blank page, no autosaves (everything published)
- Copy the URL and open a new tab so you have two tabs with the same blank page state.
- In Tab 1, add a Heading component and immediately publish all changes
- Switch to Tab 2.
- Preview is now out of date (showing blank still, but should have a Heading on it)
- Add a Hero component
- There should be a conflict warning but there isn't.
From what I can see, the issue is that the autosave hash has autoSaveRevision but it's always
"1"
. Nothing I do ever seems to increment that number (that is actually a string).{ "autoSaveRevision": "1", "hash": "efa8dd3ce4f5f9e1" }
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
@jessebaker thanks for testing this. I was testing this with nodes and it was working.
Basically entity have a seperate revision id but if we create a new revision on every save this won't increment. I guess Page is not incrementing which I guess makes sense because we don't have a way to revert to old revisions or view old revisions AFIACT
So I used the "changed" time also in the revision for the client
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
changing title to make clear we are only handling content entities, node and pages with layouts
We will need this too ๐ For config entities add client support conflict detection via the `autoSaves' key Active
- ๐ง๐ชBelgium wim leers Ghent ๐ง๐ช๐ช๐บ
How do I begin reviewing this MR? There's lots of comments here, and the issue summary doesn't talk about the
client_id
andclientInstanceId
this MR is adding.Can we get an up-to-date issue summary to facilitate a review? And ideally also sprinkle
โน๏ธ โฆ
comments over the MR to make it clearer still ๐P.S.: if ๐ Document the reasons for not using Workspaces for saving in 0.2.0 Active had landed, then I'd have asked to update that ADR. In fact โฆ any reason not to do that now, @tedbow? ๐๐ That'd be even better than an issue summary update!
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
re-rolled, added some more code comments and updating the summary but not finished. Leaving assigned to myself as I am working on the summary
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
updated more.... but still more to update
leaving assigned to myself
- Merge request !1Draft: #3526907 Concurrency control allow PATCH for region without validating other regions/main content โ (Closed) created by tedbow
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
Change the MR so that
\Drupal\experience_builder\Controller\ApiLayoutController::patch
only validates the region that contains the component being updated.I need to add test to demonstrate how this works
- ๐บ๐ธUnited States tedbow Ithaca, NY, USA
Added test coverage for regions to
\Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPostTest::testOutdatedAutoSave
.\Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPatchTest::test
already had a test for a conflict with a region - ๐บ๐ธUnited States tedbow Ithaca, NY, USA
@wim leers Re: #31 I have update the summary. I have had a chance to update ๐ Document the reasons for not using Workspaces for saving in 0.2.0 Active but I think this issue can be reviewed. Most of the back-end changes are in
\Drupal\experience_builder\Controller\ApiLayoutController::validateAutoSaves
.\Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPostTest::testOutdatedAutoSave
is the most explicit test of this.\Drupal\Tests\experience_builder\Kernel\ApiLayoutControllerPatchTest::test()
also catchesConflictHttpException
which is thrown fromvalidateAutoSaves()