Random test fail in Drupal\Tests\Component\Utility\RandomTest::testRandomMachineNamesUniqueness

Created on 24 July 2023, over 1 year ago
Updated 10 October 2023, about 1 year ago

Problem/Motivation

Test was added by 📌 randomMachineName() should conform to processMachineName() pattern Fixed .


Drupal\Tests\Component\Utility\RandomTest
exception: [Other] Line 0 of sites/default/files/simpletest/phpunit-732.xml:
PHPUnit Test failed to complete; Error: PHPUnit 9.6.8 by Sebastian Bergmann and contributors.

Testing Drupal\Tests\Component\Utility\RandomTest
......E.....                                                      12 / 12 (100%)

Time: 00:00.028, Memory: 4.00 MB

There was 1 error:

1) Drupal\Tests\Component\Utility\RandomTest::testRandomMachineNamesUniqueness
RuntimeException: Unable to generate a unique random machine name

/var/www/html/core/lib/Drupal/Component/Utility/Random.php:169
/var/www/html/core/tests/Drupal/Tests/Component/Utility/RandomTest.php:127
/var/www/html/vendor/phpunit/phpunit/src/Framework/TestResult.php:728
/var/www/html/vendor/phpunit/phpunit/src/Framework/TestSuite.php:684
/var/www/html/vendor/phpunit/phpunit/src/TextUI/TestRunner.php:651
/var/www/html/vendor/phpunit/phpunit/src/TextUI/Command.php:144
/var/www/html/vendor/phpunit/phpunit/src/TextUI/Command.php:97

https://www.drupal.org/pift-ci-job/2723487
https://www.drupal.org/pift-ci-job/2723577

Steps to reproduce

Proposed resolution

Remaining tasks

User interface changes

API changes

Data model changes

Release notes snippet

🐛 Bug report
Status

Fixed

Version

10.2

Component
Other 

Last updated 2 days ago

Created by

🇫🇮Finland lauriii Finland

Live updates comments and jobs are added and updated live.
Sign in to follow issues

Comments & Activities

  • Issue created by @lauriii
  • 🇫🇮Finland lauriii Finland

    This is actually unit test so I tried running this 100000 times locally but it's not failing even with the patch from that CI run 🤔

    OK (1200000 tests, 9800000 assertions)

  • 🇳🇱Netherlands spokje

    Looks like \Drupal\Component\Utility\Random::MAXIMUM_TRIES isn't high enough to get a 100% chance of going through all the letters of the alphabet.

  • last update over 1 year ago
    1 pass, 2 fail
  • 🇧🇪Belgium wim leers Ghent 🇧🇪🇪🇺

    Also spotted this while working on something completely unrelated.

    #5 sounds like a very plausible explanation.

  • 🇳🇱Netherlands spokje

    Ok, this is bound to become a disaster, but here's some maths:

    The changes of picking a certain letter of the alphabet is 1/26, the change of that _not_ happening in 100 times (which is what we see for the missing last machinename for length 1) = (1 - 1/26)^100 = 0.020773381827588

    That's roughly the same what our 5000x run shows with 98 failures/5000 = 0.0196

    So basically currently we have a 2% chance of a (not-so-mathematically) random fail.

    If we up \Drupal\Component\Utility\Random::MAXIMUM_TRIES to 200 we drop down to (1 - 1/26)^200 = 0.00043153339255475.
    That seems like a more workable number to me.

    The question (at least for me) is: Does the test for a 1-letter random machine name actually makes any sense knowing it can always fail and we only can diminish the fail rate.

    Disclaimer: I was sick the day they explained maths at my school...

  • 🇬🇧United Kingdom longwave UK

    testRandomNamesUniqueness() only generates 10 names and as far as I can see we never have a false positive there - should we make the machine name one match this?

  • last update over 1 year ago
    12 pass, 2 fail
  • 🇳🇱Netherlands spokje

    testRandomNamesUniqueness() only generates 10 names and as far as I can see we never have a false positive there, or if we do the failure rate is so low I haven't seen it. Should we make the machine name one match this?

    I assume you meant testRandomNamesUniqueness()?
    At the very least, that doesn't fail in the above 5000x run.

  • 🇳🇱Netherlands spokje

    This seems to be known as the "Coupon collector's problem" to smart people.

    For non-smart people, like myself, the handy graph (https://upload.wikimedia.org/wikipedia/commons/4/46/Coupon_collector_pro...) in the above Wikipedia seems to indicate that for 26 (possible letters in a 1 letter machine name) we need on average 101 attempts to get them all.

    For a 1 number random name (with 10 possible digits), we "only" need 30 tries.

    So that's the explanation why 100 tries with 1 digit is fair, but 100 tries for 26 letters is even below the average, let alone leaving room for being "unlucky"...

  • Status changed to Needs review over 1 year ago
  • last update over 1 year ago
    29,953 pass
  • 🇳🇱Netherlands Lendude Amsterdam

    Interesting reading on "Coupon collector's problem"! I'd gotten as far as "wow, that is actually really hard to calculate" :D

    How about something like this?

    We can more accurately test the expected outcome and calculate the odds and with the bump to 1000 attempts the odds of getting no 'z's are practically none (with 100 tries it is still almost 2%). By running it 25 times we can pretty much eliminate the chance that it just happened to grab the 'z' on the first try.

  • Status changed to Needs work over 1 year ago
  • 🇺🇸United States smustgrave

    There a way we can run #12 5000 times like #9 and see what has the better rate?

  • 🇨🇭Switzerland berdir Switzerland

    Saw this on two different RTBC issues this morning, raising to critical.

  • Status changed to Needs review about 1 year ago
  • last update about 1 year ago
    12 pass, 1 fail
  • 🇬🇧United Kingdom catch

    And again on a branch test. This is #12 5000 times.

    The whole point of this test was to reduce random test failures so if it doesn't work we should skip it for now. Overall I think we need to start switching over to hard-coded machine names that are unique within test classes and reduce use of randomnesss.

  • Status changed to Needs work about 1 year ago
  • 🇬🇧United Kingdom catch
    Testing Drupal\Tests\Component\Utility\RandomTest
    ...........F                                                      12 / 12 (100%)
    
    Time: 00:00.027, Memory: 4.00 MB
    
    There was 1 failure:
    
    1) Drupal\Tests\Component\Utility\RandomTest::testRandomWordValidator
    Failed asserting that 'wrike' is not equal to 'wrike'.
    
    /var/www/html/vendor/phpunit/phpunit/src/Framework/Constraint/Constraint.php:122
    /var/www/html/vendor/phpunit/phpunit/src/Framework/Constraint/Constraint.php:55
    /var/www/html/core/tests/Drupal/Tests/Component/Utility/RandomTest.php:197
    /var/www/html/vendor/phpunit/phpunit/src/Framework/TestResult.php:728
    /var/www/html/vendor/phpunit/phpunit/src/Framework/TestSuite.php:684
    /var/www/html/vendor/phpunit/phpunit/src/TextUI/TestRunner.php:651
    /var/www/html/vendor/phpunit/phpunit/src/TextUI/Command.php:144
    /var/www/html/vendor/phpunit/phpunit/src/TextUI/Command.php:97
    
    FAILURES!
    

    That's a different random test failure.

  • Status changed to Needs review about 1 year ago
  • last update about 1 year ago
    30,388 pass, 2 fail
  • 🇬🇧United Kingdom catch

    Let's skip those two tests.

    I do not see the benefit in debugging the random tests in the tests we added to test the code we modified to try to reduce the amount of random failures we have - we should just stop doing so much random crap everywhere.

  • 🇨🇭Switzerland berdir Switzerland

    > I do not see the benefit in debugging the random test failures in the tests we added to test the code we modified to try to reduce the amount of random failures we have - we should just stop doing so much random crap everywhere.

    +1.

    Random values in tests should IMHO be discouraged unless there is a specific reason for it. It is unlikely to catch real bugs and makes debugging and understanding test failures much harder when you immediately see which bundle/field/thing the test is complaining about.

  • Status changed to Needs work about 1 year ago
  • 🇬🇧United Kingdom longwave UK
    1. +++ b/core/tests/Drupal/Tests/Component/Utility/RandomTest.php
      @@ -121,6 +121,7 @@ public function testRandomStringNonUnique() {
           for ($i = 0; $i <= 25; $i++) {
      

      As I tried to say in #8, here I think we should generate only 10 random machine names instead of 25; if we start generating duplicates for some reason we should start seeing test failures here. I haven't done the calculations but I would assume the chances of getting an accidental repeat when generating only 10 is so small we won't see it in practice.

    2. +++ b/core/tests/Drupal/Tests/Component/Utility/RandomTest.php
      @@ -181,6 +182,7 @@ public function testRandomObject() {
      +    $this->markTestSkipped('Skipped due to random test failures.');
      

      This is skipping the wrong method? The skip is added to testRandomStringValidator but the fail in #17 is in testRandomWordValidator.

      I guess here what has happened is randomly we have managed to generate "wrike" twice in a row. This could still happen by random chance, so the test is invalid. Perhaps we should generate three words, and check that at least two of them are unique? This seems much less likely to happen, although it is still possible.

  • Status changed to Needs review about 1 year ago
  • last update about 1 year ago
    30,392 pass
  • 🇬🇧United Kingdom alexpott 🇪🇺🌍

    We could choose to remove these tests but going along with @longwave... here's a patch that reduces the loops in testRandomMachineNamesUniqueness() and fixes testRandomWordValidator() to only test that seeding results in the same word being generated.

    FWIW 10 loops is exactly what testRandomNamesUniqueness() has been doing for years without issue.

  • last update about 1 year ago
    12 pass
  • 🇬🇧United Kingdom catch

    Let's run that 5,000 times x 2 then. I am happy with anything that stops the bleeding here, we can rip out randomness in tests later.

    If it fails, I will add patch to remove the two methods, really done with this otherwise.

  • last update about 1 year ago
    12 pass
  • 🇬🇧United Kingdom longwave UK

    testRandomWordValidator() is badly named - it doesn't test the validation callback, whereas testRandomStringValidator() does, and it doesn't need to use the instance variable at all. Also the seed is an implementation detail (Random could choose to use a different RNG than the PHP one, one day) and I'm not sure we should be testing that. But it's also the only test of Random::word().

  • last update about 1 year ago
    30,392 pass
  • last update about 1 year ago
    12 pass
  • 🇬🇧United Kingdom alexpott 🇪🇺🌍

    Addressing #25

  • last update about 1 year ago
    12 pass
  • Status changed to RTBC about 1 year ago
  • 🇬🇧United Kingdom catch

    Assuming #26 is green, I'll commit it a bit later. Kicked off an extra test run so we get 10,000.

    We already have #2571183: Deprecate random() usage in tests to avoid random test failures for reducing random usage in tests.

  • 🇬🇧United Kingdom alexpott 🇪🇺🌍

    Assigning contribution credit.

    • catch committed d47b0aff on 10.2.x
      Issue #3376563 by catch, alexpott, Spokje, Lendude, lauriii, longwave,...
    • catch committed e99bc70d on 11.x
      Issue #3376563 by catch, alexpott, Spokje, Lendude, lauriii, longwave,...
  • Status changed to Fixed about 1 year ago
  • 🇬🇧United Kingdom catch

    Committed/pushed to 11.x and cherry-picked to 10.2.x, thanks!

  • Automatically closed - issue fixed for 2 weeks with no activity.

Production build 0.71.5 2024