do not query s3 for directory info when cache is disabled

Created on 1 March 2023, almost 2 years ago
Updated 6 December 2023, about 1 year ago

If the ignore cache setting is enabled, getS3fsObject still grabs from the cache for directories, since these are not objects in S3 (only files). The problem is some logic around this code was determining the conditions incorrectly, resulting in a call to S3 even for directories, which then fails. This simple patch I believe applies the logic the way it is intended. It certainly fixed my issue.

πŸ› Bug report
Status

Fixed

Version

3.0

Component

Code

Created by

πŸ‡¨πŸ‡¦Canada Shiraz Dindar Sooke, BC

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Merge Requests

Comments & Activities

  • Issue created by @Shiraz Dindar
  • πŸ‡¨πŸ‡¦Canada Shiraz Dindar Sooke, BC
  • Status changed to Needs work almost 2 years ago
  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    I thinks I saw this while working on the 4.x branch and forgot to back port it to 3.x so it sounds reasonable this might be an issue ( I have not sat down and written the truth table out to verify however)

    I would personally like to keep the ignore_cache check before any others as it is the most likely check to be FALSE as it is unusual and not recommended to have the cache disabled.

    This could use some tests in S3fsMetadataTest::testIgnoreCache()

  • πŸ‡¨πŸ‡¦Canada Shiraz Dindar Sooke, BC

    Hi again, yes, so for some context please see https://www.drupal.org/project/s3fs/issues/3345094#comment-14947120 πŸ“Œ issue with S3fsFileService move method during migrations Postponed: needs info .

    The same applies here, meaning I wouldn't necessarily expect you to roll the patch, but thought I would share to help others.

    That said, patch seems more straight ahead than the last. I looked at the logic there -- if (!empty($this->config['ignore_cache']) && !($metadata && !$metadata['dir'])) -- and indeed how it ANDs with the NOTs in there actually results in the in, if the ignore cache setting is enabled, getS3fsObject still tries to grab directories from S3. ie. not the stated purpose as in the comment.

    I"m reasonably confident that I fixed this in the right way and that you'll want to get this, or a variation of this, committed, but again, not pressure as far as I'm concerned.

    Thanks!
    Shiraz

  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    The request for tests additions is because this is an area of code we have had repeat issues with, I added some tests last time however if there is an issue that indicates the existing tests are not sufficient and that we could have a regression in the future if we don't update them now.

    The desire to have ignore_cache as the first check is because I guesstimate somewhere above 99% of the time this method is called ignore_cache will not be enabled, we can return early based on this without an extra check for if the metadata record, its a small optimization and it allows the code to signal what the most common scenario is to future editors.

  • πŸ‡ΊπŸ‡¦Ukraine deimos

    Patch solves my problem.
    Thank you.

  • First commit to issue fork.
  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    Looking at this the code is nearly impossible to test in the 3.x architecture A the time I had made that request I was deep into 4.x dev where it had been massively refactored and was much easier to mock up. In hindsight I should have noticed this and provided some advice on how we could have moved this forward.

    Given how many times I've personally have made mistakes with this logic I see significant value in making this section testable in 8.x.3.x. In order to do so without massive refactoring I suggest we move the determination logic to a trait where we can unit test this small portion of logic.

    I've opened MR38 with this moved to a trait, and added a test.

    I did a quick test with both MR37 and the previous patch inside the trait after writing the tests and both appeared to still have edge cases associated with the logic where they would either incorrectly categorized a couple of execution paths, or error out due to a missing array key.

    I'm generally not great at Test Driven Design as I often do not see all the variables until I've started developing the methods(possibly because the methods are too complex) however in this case I'm thankful I wrote the test first as I will admit this took a couple of iterations to make pass.

    Thoughts?

  • Status changed to Needs review about 1 year ago
  • πŸ‡ΊπŸ‡ΈUnited States cmlara

    Should have set NR last week.

  • πŸ‡ΊπŸ‡ΈUnited States cmlara
  • Status changed to Fixed about 1 year ago
    • cmlara β†’ committed 8dde12b0 on 8.x-3.x
      Issue #3345097 by smustgrave, cmlara, Shiraz Dindar: do not query s3 for...
  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024