- 🇺🇸United States smustgrave
This issue is being reviewed by the kind folks in Slack, #needs-review-queue-initiative. We are working to keep the size of Needs Review queue [2700+ issues] to around 400 (1 month or less), following Review a patch or merge request → as a guide.
Reviewing MR #2839. Hiding files to avoid confusion.
Code wise everything looks fine
But when I ran the tests locally without the fix to DataStorage.php they still passed, when they should fail to show the bug.
Thanks!
- 🇬🇧United Kingdom Driskell
@smustgrave
I just ran the tests without the fix and received a failure. I then applied the fix and it worked again.
However, from a hunch, I hooked up SQLite as the test database, and yes it did not fail!Bit of debugging later...
https://www.sqlite.org/lang_keywords.html
> If a keyword in double quotes (ex: "key" or "glob") is used in a context where it cannot be resolved to an identifier but where a string literal is allowed, then the token is understood to be a string literal instead of an identifier.The query that is meant to fail because of a missing column actually now resolves to a "does this string equal this string" and is accepted as a string expression!
> SELECT 1 FROM "test57957719"."config" WHERE "collection" = :collection AND "name" = :name LIMIT 0, 1Will have a think about a way to make the test fail properly on SQLite as well.
- 🇬🇧United Kingdom Driskell
OK, tracked the quoting to double quotes back to https://www.drupal.org/project/drupal/issues/3126658 →
It seems not to have taken into account SQLite.
I checked and the ANSI quoting in Pgsql and MySQL should be fine as that forces double quote as identifier and it will never be string literal.
In SQLLite it actually is string literal if the column does not exist.I don't know but... I am wondering if the above needs to be a separate issue as that's in a different module. So I'll get a ticket raised.
I've noticed there's some duplication of tests in the MR anyway so I'll push a cleaned up version - and I'll set the other ticket for SQLLite as blocking this one. - Status changed to Postponed
almost 2 years ago 10:50am 21 March 2023 - Status changed to Needs review
almost 2 years ago 10:55am 21 March 2023 - 🇬🇧United Kingdom Driskell
@smustgrave
I'm guessing the impact to all SQLite queries of a change to that would prevent this from being a bug fix in 9.5.
However, perhaps there just needs to be an acceptance that this test won't function properly with SQLite until the other issue is fixed so maybe there's a decision to be made here - in that the test will properly work on MySQL so the automated tests will work fine. It only will cause a problem for running locally with SQLite. Just I feel this patch has saved me many issues and is hugely beneficial - our incidents relating to cache corruption in this area are eliminated so far.
So perhaps the SQLite issue does not postpone this? So I'm U-turning as usual and returning to Needs review to get some thoughts.
- 🇫🇷France andypost
The workaround looks incomplete
tableExists()
making a query on exception, but when other process creating table it could be wrong assumption - 🇬🇧United Kingdom Driskell
@andypost
At the moment it fails silently and corrupts cache during normal day to day running. The scenario of a writer creating the table and causing a valid read to then throw exception is true, but I think that would only occur once, when the table doesn't yet exist, and only in the perfect race between read and the very first ever write to the table, and it would recover completely on the second request and never happen again. Unfortunately it's difficult to quantify the real impact of that scenario but it seems to be small, and much smaller than the current issue's impact.
Technically speaking I think the correct fix is to detect the specific error of the table not existing. However, this might be complex across the different drivers and versions supported to detect the specific error message, and would require driver changes. If that was opted into it would be better I think as a post-task after this to get those driver implementation for throwing a specific exception for table missing, for use in this code, so we're never handling arbitrary exceptions and only handling the exception we intend to: Table does not exist. Arguably that would be easier to test too and wouldn't need the quoting change for SQLite.
What are your thoughts?
- 🇬🇧United Kingdom Driskell
Not seen it before but found an issue that was adding the table exception checks that would also resolve this: https://www.drupal.org/project/drupal/issues/2371709 📌 Move the on-demand-table creation into the database API Needs work
- 🇬🇧United Kingdom catch
The workaround looks incomplete tableExists() making a query on exception, but when other process creating table it could be wrong assumption
This is fine - the worst that happens is a cache miss, and it's how we handle it in other subsystems.
- Status changed to Needs work
almost 2 years ago 9:57pm 21 March 2023 The Needs Review Queue Bot → tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
Apart from a re-roll or rebase, this issue may need more work to address feedback in the issue or MR comments. To progress an issue, incorporate this feedback as part of the process of updating the issue. This helps other contributors to know what is outstanding.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- 🇬🇧United Kingdom longwave UK
I have a busy site that is experiencing occasional strange bugs where fields suddenly no longer exist (causing subsequent fatal errors in different places each time), but this is fixed on cache rebuild, and I am wondering if this is the culprit.
I think the fix here looks good for the problem at hand but I wonder if there is a way of refactoring this so we explicitly create the table separately and then the service itself doesn't have to be concerned with catching exceptions at all; it can just pass them up to the caller. This should be a followup, though.]
- Status changed to Needs review
about 1 year ago 1:37am 10 November 2023 - 🇳🇿New Zealand ericgsmith
I suspect we also hit this issue recently. The symptoms were very similar to #37 - config status showed a list of config as in sync only - clearing the cache went back to showing no differences.
Fixed CS issue reported in #36 and opened new MR to 11.x - setting back to needs review.
- Status changed to Needs work
about 1 year ago 2:49am 10 November 2023 - 🇳🇿New Zealand ericgsmith
Ah - setting back to needs work. As was indicated in #26 the test is failing on SQLite.
Have triggered the additional pipelines for PostgreSQL and SQLite and it indeed is still failing form SQLite https://git.drupalcode.org/issue/drupal-3181013/-/pipelines/4703
- 🇬🇧United Kingdom Driskell
I am putting this back into Needs Review because a failing test is impossible for SQLite due to a known issue (see #3349286). The test system online is MySQL so reproduces it correctly. You cannot test this locally with a SQLite database, it won't fail.
As for the other issues, #2371709 should be a follow on, it's a bit more involved. And hopefully at some point #3349286 can be looked at to fix the SQLite issue. But I don't feel either of these block this fix. The first improved it. The second is purely relating to local tests.
- Status changed to Needs review
about 1 year ago 8:49am 17 November 2023 - 🇳🇿New Zealand ericgsmith
Having also been meaning to come back to say we can almost certainly confirm this issue is what is responsible for the config shown as overridden.
Further investigation showed multiple errors in the logs that mysql had gone away over about 1 minute. We can also see higher than expected numbers of evictions in memcache during this period from a combination of high traffic and our memcache cache bin size being lower than what the site needed. Immediately after the mysql gone away logs we start seeing warning of missing fields that imply the field does not exist, and as mentioned in #40 this was visible via drush config status and resolved after a cache clear.
We are now running this patch in production now along side increasing the cache bin size of memcache.
- 🇬🇧United Kingdom Driskell
I've updated the MR to include a skip for SQLLite and linked it to the issue about the double quoting that prevents the test from working.
- 🇬🇧United Kingdom Driskell
Sorry - that last pushed commit was to fix the URL to be correct issue
- Status changed to Needs work
about 1 year ago 1:07pm 17 November 2023 The Needs Review Queue Bot → tested this issue. It fails the Drupal core commit checks. Therefore, this issue status is now "Needs work".
This does not mean that the patch needs to be re-rolled or the MR rebased. Read the Issue Summary, the issue tags and the latest discussion here to determine what needs to be done.
Consult the Drupal Contributor Guide → to find step-by-step guides for working with issues.
- Status changed to Needs review
about 1 year ago 1:14pm 17 November 2023 - Status changed to Needs work
about 1 year ago 5:13pm 17 November 2023 - 🇺🇸United States smustgrave
Can the 9.5.x MR be closed since D9 is EOL please
The other MR seems to have some test failures potentially.
- 🇬🇧United Kingdom Driskell
@smustgrave I rebased it to 11.x and updated the MR. Not sure how to update the other one so I kept my existing.
It skips the test for SQLite since it will fail due to the linked issue. But it successfully fails without the fix on MariaDB / MySQL and passes after the patch. - Status changed to Needs review
about 1 year ago 10:40pm 17 November 2023 - Status changed to Needs work
about 1 year ago 12:28am 26 November 2023 - 🇺🇸United States smustgrave
Glad switching targets didn't cause issues, usually changing that causes issues.
Checking the issue summary
I am not sure the exact fix yet, but in my opinion, we should not set a permanent faulty cache to the cache backend when there is an exception on getting the data
Isn't a full solution, can the proposed solution be documented helps reviewers/committers.
Added missing sections too, left TBD for sections that may apply but not familiar with issue to know myself. If not applicable just change to NA.
- Status changed to Needs review
about 1 year ago 8:43am 26 November 2023 - 🇬🇧United Kingdom Driskell
@smustgrave I updated issue summary. It was proposed resolution that was taken so I just dropped the not sure bit and I’ve updated other parts to reflect final status. Thanks for checking
- 🇬🇧United Kingdom Driskell
Added more detailed description of problem and solution to IS
- Status changed to RTBC
about 1 year ago 5:58pm 26 November 2023 - last update
about 1 year ago 30,675 pass, 2 fail - Status changed to Fixed
about 1 year ago 11:40am 27 November 2023 - 🇬🇧United Kingdom catch
Committed/pushed to 11.x, cherry-picked to 10.2.x and 10.1.x, thanks!
Automatically closed - issue fixed for 2 weeks with no activity.