Tested on 10.1.x-dev with a standard install
module drupal/security_review:2.0.x-dev@dev
branch 3337293-new-checksnamepasswords-does
This branch definitely has better performance: I can run it on my own local system without issues until I get to about 15k users, at which point timeouts occur. Even though there are still timeout issues, I think this is a great incremental step and better than the previous code.
I suggest this branch gets merged and maybe we open another issue for something better in the future.
Possible further solutions:
#1 More queues. Each test is already being processed as a queued item; I'm not sure how we would then create and run another dependent queue as part of this test's execution. If the original test could populate a new queue of users and then use its own time to process that queue, we could go as slowly as we want and scale forever.
#2 Skip all the user loading overhead by leaving authService out of the picture. We could pull only the name/password columns from the database and jump straight to the important part of the code where PasswordInterface's check() function is called. We could call that ourselves and hopefully skip a bunch of intermediary function calls that might incur SQL queries of their own (see the sketch after this list).
#3 Skip Drupal by pushing all the logic into SQL. Since the heart of the password check is an md5 hash comparison (ignoring old-style D7 passwords), we could check the existing stored hash against the username directly as part of the query. This might theoretically be faster, but it rests on the fragile assumption that the md5 implementation used by the database is comparable to the md5 used by the PHP crypt library.
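As a rough illustration of option #2, a minimal sketch might look like the following. This is not existing module code; it assumes the users_field_data base table and core's 'password' service (PasswordInterface), and the variable names are illustrative only.

// Sketch of option #2: skip full user entity loading and compare name vs.
// password directly using only the two columns we need.
$connection = \Drupal::database();
$password_checker = \Drupal::service('password');

$rows = $connection->select('users_field_data', 'u')
  ->fields('u', ['uid', 'name', 'pass'])
  ->condition('u.uid', 0, '>')
  ->execute();

$weak_uids = [];
foreach ($rows as $row) {
  // PasswordInterface::check() compares a plain-text candidate against a
  // stored hash, so no user entity ever has to be loaded.
  if ($password_checker->check($row->name, $row->pass)) {
    $weak_uids[] = $row->uid;
  }
}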
I was looking at this and don't think I like A) or B). As a utility function, DiffArray should not be making changes to the form of its inputs, and core definitely should not be updated with a hardcoded class name. It seems like normalizing the data should instead be done in StorableConfigBase itself, since it is the one opinionated about only storing arrays; it could attempt any array conversion/normalization/serialization of objects there. Alternatively, config_split itself could massage everything, since it is opinionated about only working with arrays, or layout_builder could make sure all of its configurable data are arrays to begin with.
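Purely as a hypothetical illustration of the kind of normalization I mean (this is not existing core, config_split, or layout_builder code), the idea is to recursively cast object values into plain arrays before they are stored or diffed:

// Hypothetical helper, for illustration only: ensure nothing but arrays and
// scalars reach the diff/storage layer by converting object values.
function _example_normalize_config_value($value) {
  if (is_object($value)) {
    // A value object could expose its own array form; otherwise fall back
    // to a generic cast.
    $value = method_exists($value, 'toArray') ? $value->toArray() : (array) $value;
  }
  if (is_array($value)) {
    foreach ($value as $key => $item) {
      $value[$key] = _example_normalize_config_value($item);
    }
  }
  return $value;
}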
I have tested this and it functions as expected without errors.
Tested on 10.1.x-dev with a standard install
module drupal/security_review:2.0.x-dev@dev
branch 3008957-refactor-host
Test procedure:
Do not specify any trusted host patterns
run test to see "Trusted hosts are not set"
add trusted_host within main settings.php (see the example snippet after this list)
run test to see "Trusted hosts are set"
remove trusted_host within main settings.php
add trusted_host within a secondary file included/required from settings.php
run test to see "Trusted hosts are set"
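For reference, the setting added in the steps above is Drupal's standard trusted_host_patterns entry in settings.php; the host pattern below is only an example:

// Example settings.php snippet; replace the pattern with your own host.
$settings['trusted_host_patterns'] = [
  '^example\.com$',
];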
I have tested this and it functions as expected without errors.
Tested on 10.1.x-dev with a standard install
module drupal/security_review:2.0.x-dev@dev
patch 3221065-9.patch
Test procedure:
start with base 2.0.x module code
add an unreasonable number of content items via devel
drush devel-generate-content --bundles=page 200001
run test to see Memory error generated
apply patch
run test to see it completes and reports no Field errors
add script tag to any piece of content
run test to see a Field error in the piece of content
add the errored content's reported Hash to the ignore list
run test to see no Field errors
I wanted to try a different approach than the 3221065-7 patch.
Problem: This type of check cannot rely on Drupal's internal dynamically built queries. Drupal core Database's StatementPrefetch relies on a full download of all result data from the database, which means there will always be a resource limit imposed by the Drupal host because of this one call in core:
$this->data = $statement->fetchAll(\PDO::FETCH_ASSOC);
Possible Solutions: In this type of situation you either need to run a separate small query per item you want to check, or push the storage of the result set into the database itself and only ask to download one result at a time.
Solution: Running a separate query to load each node in the system seems like a waste, so I will try generating one query per field instead and attempt to leave the data on the storage server as long as possible. To do this, I will use a static query instead of a dynamic one. A drawback to this method is that, since the table and column names are dynamic, we have to construct a new query for every field we want to check. Normally you should avoid string concatenation when building queries, but in this specific situation all column names are being pulled directly from Drupal in the first place, so they should be safe in a practical sense from any possible corruption, even if we have to use a bad practice.
This patch reorders the loops so that the id and data columns are calculated first, which allows us to generate a new query for each field that needs to be checked. We also need to be sure to unset the $query variable on each loop iteration to free any temporary memory reserved by the underlying libraries.
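A rough sketch of the shape of that approach follows. This is illustrative only, not the patch itself; $field_tables and the example table/column names are placeholders, whereas in the patch they come from Drupal's own field storage definitions, which is why the string concatenation is tolerable.

$connection = \Drupal::database();
foreach ($field_tables as $table => $columns) {
  // e.g. $table = 'node__body', $columns = ['entity_id', 'body_value'].
  $sql = 'SELECT ' . implode(', ', $columns) . ' FROM {' . $table . '}';
  $query = $connection->query($sql);
  // Iterate the statement row by row instead of calling fetchAll(), so the
  // full result set never has to be held in PHP memory at once.
  foreach ($query as $row) {
    // ... inspect the field values for dangerous tags ...
  }
  // Free any temporary memory held for this statement before the next field.
  unset($query);
}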